[ https://issues.apache.org/jira/browse/SOLR-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818199#comment-16818199 ]
ASF subversion and git services commented on SOLR-13336:
--------------------------------------------------------
Commit d90034f0d61cd1525e10d07cf064a8647dc08cc9 in lucene-solr's branch
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d90034f ]
SOLR-13336: add maxBooleanClauses (default to 1024) setting to solr.xml,
reverting previous effective value of Integer.MAX_VALUE-1, to restrict risk of
pathological query expansion.
> maxBooleanClauses ignored; can result in exponential expansion of naive
> queries
> -------------------------------------------------------------------------------
>
> Key: SOLR-13336
> URL: https://issues.apache.org/jira/browse/SOLR-13336
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: query parsers
> Affects Versions: 7.0, 7.6, master (9.0)
> Reporter: Michael Gibney
> Assignee: Hoss Man
> Priority: Major
> Attachments: SOLR-13336.patch, SOLR-13336.patch, SOLR-13336.patch
>
>
> Since SOLR-10921 it appears that Solr always sets
> {{BooleanQuery.maxClauseCount}} (at the Lucene level) to
> {{Integer.MAX_VALUE-1}}. I assume this is because Solr parses
> {{maxBooleanClauses}} out of the config and applies it externally.
> In any case, when used as part of
> {{lucene.util.QueryBuilder.analyzeGraphPhrase}} (and possibly other places?),
> the Lucene code checks internally against only the static {{maxClauseCount}}
> variable (permanently set to {{Integer.MAX_VALUE-1}} in the context of Solr).
> Thus in at least one case ({{analyzeGraphPhrase()}}, but possibly others?),
> {{maxBooleanClauses}} is having no effect. I'm pretty sure this is what's
> underlying the [issue reported here as being related to Solr
> 7.6|https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201902.mbox/%3CCAF%3DheHE6-MOtn2XRbEg7%3D1tpNEGtE8GaChnOhFLPeJzpF18SGA%40mail.gmail.com%3E].
> To summarize, users are definitely susceptible (to varying degrees of likely
> severity, assuming no actual _malicious_ attack) if:
> # Running Solr >= 7.6.0
> # Using edismax with "ps" param set to >0
> # Query-time analysis chain is _at all_ capable of producing graphs (e.g.,
> WordDelimiterGraphFilter, SynonymGraphFilter that has corresponding synonyms
> with varying token lengths).
> Users are _particularly_ vulnerable in practice if they have query-time
> {{WordDelimiterGraphFilter}} configured with {{preserveOriginal=true}}.
> To clarify, Lucene/Solr 7.6 didn't exactly _introduce_ the issue; it only
> increased the likelihood of problems manifesting (as a result of
> LUCENE-8531). Notably, the "enumerated strings" approach to graph phrase
> query (reintroduced by LUCENE-8531) was previously in place pre-6.5 – at
> which point it could rely on default Lucene-level {{maxClauseCount}} failsafe
> (removed as of 7.0). This explains the odd "Affects Versions" list:
> {{maxBooleanClauses}} was disabled at the Lucene level (in Solr contexts)
> starting with version 7.0, but the change only became likely to cause
> problems for users as of 7.6.
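To make the expansion risk described above concrete, here is a minimal, self-contained sketch. It does not call Lucene or Solr. The class and method names are hypothetical, and the figures (12 query tokens, 3 alternative paths per token) are illustrative assumptions rather than numbers from the issue. It treats each token position as independent, which is a simplification of a real token graph, and compares the number of enumerated phrases against a limit of 1024 and against Integer.MAX_VALUE-1.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: models the size of an "enumerated strings"
// graph phrase expansion; it is not the Lucene implementation.
class GraphPhraseExpansion {

    // Number of distinct paths through a token graph whose positions are
    // independent: the product of the alternatives at each position.
    static long countPaths(List<Integer> alternativesPerPosition) {
        long paths = 1;
        for (int alternatives : alternativesPerPosition) {
            // multiplyExact throws ArithmeticException instead of silently overflowing
            paths = Math.multiplyExact(paths, (long) alternatives);
        }
        return paths;
    }

    // Stand-in for a clause-count guard: true when the expansion would be rejected.
    static boolean exceedsLimit(long clauses, int maxClauseCount) {
        return clauses > maxClauseCount;
    }

    public static void main(String[] args) {
        // Assumption: 12 tokens, each analyzed into 3 alternative paths
        // (e.g. original, split, and concatenated forms).
        List<Integer> query = Collections.nCopies(12, 3);
        long paths = countPaths(query);

        System.out.println("enumerated phrases: " + paths); // 3^12 = 531441
        System.out.println("rejected at 1024: " + exceedsLimit(paths, 1024)); // true
        System.out.println("rejected at Integer.MAX_VALUE-1: "
                + exceedsLimit(paths, Integer.MAX_VALUE - 1)); // false
    }
}
```

Under these assumptions a 12-token query already enumerates over half a million phrases. A limit of 1024 stops it early, while a limit of Integer.MAX_VALUE-1 lets the full expansion proceed.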
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)