[jira] [Commented] (SOLR-13751) Add BooleanSimilarityFactory to Solr
[ https://issues.apache.org/jira/browse/SOLR-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176407#comment-17176407 ] Andy Webb commented on SOLR-13751: -- Great, thanks [~cpoerschke]! > Add BooleanSimilarityFactory to Solr > > > Key: SOLR-13751 > URL: https://issues.apache.org/jira/browse/SOLR-13751 > Project: Solr > Issue Type: New Feature >Reporter: Andy Webb >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.7 > > Time Spent: 40m > Remaining Estimate: 0h > > Solr doesn't expose Lucene's BooleanSimilarity (ref LUCENE-5867) so it's not > available for use in situations where BM25/TF-IDF are not useful. (Fields > using this similarity will likely also set omitNorms and > omitTermFreqAndPositions to true.) > Our use case is ngram-driven suggestions, where the frequency of occurrence > of a particular sequence of characters is not something users would expect to > be taken into account when ordering suggestions. > > Here's my PR: [https://github.com/apache/lucene-solr/pull/867] (I'm at > Activate if anyone would like to talk this through in person.) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14252) NullPointerException in AggregateMetric
[ https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046804#comment-17046804 ] Andy Webb commented on SOLR-14252: -- No worries - thank you for picking this up! Andy > NullPointerException in AggregateMetric > --- > > Key: SOLR-14252 > URL: https://issues.apache.org/jira/browse/SOLR-14252 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andy Webb >Assignee: Andrzej Bialecki >Priority: Major > Fix For: 8.5 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > The {{getMax}} and {{getMin}} methods in > [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] > can throw an NPE if non-{{Number}} values are present in {{values}}, when it > tries to cast a {{null}} {{Double}} to a {{double}}. > This PR prevents the NPE occurring: > [https://github.com/apache/lucene-solr/pull/1265] > (We've also noticed an error in the documentation - see > https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 > - this could be pulled out into a separate ticket if necessary?)
[jira] [Commented] (SOLR-14252) NullPointerException in AggregateMetric
[ https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040066#comment-17040066 ] Andy Webb commented on SOLR-14252: -- We have this patch working on a non-production system that previously suffered NPEs when the metrics reporter is configured to show e.g. {{CACHE\.searcher.*}} which contains both numeric and non-numeric metrics. The output below illustrates how the latter's aggregations are now zero: {noformat} ... "CACHE.searcher.statsCache.missingGlobalFieldStats": { "count": 2, "max": 280, "min": 0, "mean": 140, "stddev": 197.9898987322333, "sum": 280, "values": { "core_node6": { "value": 280, "updateCount": 320 }, "core_node4": { "value": 0, "updateCount": 318 } } }, ... "CACHE.searcher.statsCache.statsCacheImpl": { "count": 2, "max": 0, "min": 0, "mean": 0, "stddev": 0, "sum": 0, "values": { "core_node6": { "value": "LocalStatsCache", "updateCount": 320 }, "core_node4": { "value": "LocalStatsCache", "updateCount": 318 } } }, ... {noformat} > NullPointerException in AggregateMetric > --- > > Key: SOLR-14252 > URL: https://issues.apache.org/jira/browse/SOLR-14252 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andy Webb >Assignee: Andrzej Bialecki >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > The {{getMax}} and {{getMin}} methods in > [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] > can throw an NPE if non-{{Number}} values are present in {{values}}, when it > tries to cast a {{null}} {{Double}} to a {{double}}. 
> This PR prevents the NPE occurring: > [https://github.com/apache/lucene-solr/pull/1265] > (We've also noticed an error in the documentation - see > https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 > - this could be pulled out into a separate ticket if necessary?)
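The figures in the sample output above are internally consistent: for the values 280 and 0 the mean is 140, and the reported stddev matches a sample (n - 1) standard deviation, sqrt(140^2 + 140^2), roughly 197.99. A minimal Java sketch of that arithmetic follows. The class and method names are hypothetical and this is not the AggregateMetric source, only a way to check the numbers; treating non-Number entries as contributing nothing is an assumption chosen to match the zeroed aggregates shown for statsCacheImpl.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Solr code: recomputes the aggregates from the sample output.
class AggregateArithmeticSketch {

    // Mean of the numeric entries; non-Number entries are skipped (assumption).
    // Returns 0 when no entry is numeric.
    static double mean(List<Object> values) {
        double sum = 0;
        int n = 0;
        for (Object v : values) {
            if (v instanceof Number) {
                sum += ((Number) v).doubleValue();
                n++;
            }
        }
        return n == 0 ? 0.0 : sum / n;
    }

    // Sample standard deviation (divides by n - 1) of the numeric entries.
    // Returns 0 when fewer than two entries are numeric.
    static double sampleStdDev(List<Object> values) {
        double m = mean(values);
        double squares = 0;
        int n = 0;
        for (Object v : values) {
            if (v instanceof Number) {
                double d = ((Number) v).doubleValue() - m;
                squares += d * d;
                n++;
            }
        }
        return n < 2 ? 0.0 : Math.sqrt(squares / (n - 1));
    }

    public static void main(String[] args) {
        List<Object> numeric = Arrays.<Object>asList(280, 0);
        List<Object> nonNumeric = Arrays.<Object>asList("LocalStatsCache", "LocalStatsCache");
        System.out.println(mean(numeric) + " " + sampleStdDev(numeric));
        System.out.println(mean(nonNumeric) + " " + sampleStdDev(nonNumeric));
    }
}
```

Running main prints the mean and standard deviation for the numeric pair and then for the non-numeric pair, which can be compared against the "mean" and "stddev" fields in the output above.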
[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric
[ https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14252: - Status: Patch Available (was: Open) > NullPointerException in AggregateMetric > --- > > Key: SOLR-14252 > URL: https://issues.apache.org/jira/browse/SOLR-14252 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andy Webb >Assignee: Andrzej Bialecki >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > The {{getMax}} and {{getMin}} methods in > [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] > can throw an NPE if non-{{Number}} values are present in {{values}}, when it > tries to cast a {{null}} {{Double}} to a {{double}}. > This PR prevents the NPE occurring: > [https://github.com/apache/lucene-solr/pull/1265] > (We've also noticed an error in the documentation - see > https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 > - this could be pulled out into a separate ticket if necessary?)
[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric
[ https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14252: - Description: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. This PR prevents the NPE occurring: [https://github.com/apache/lucene-solr/pull/1265] (We've also noticed an error in the documentation - see https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 - this could be pulled out into a separate ticket if necessary?) was: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. This PR prevents the NPE occurring: [https://github.com/apache/lucene-solr/pull/1247] (We've also noticed an error in the documentation - see https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 - this could be pulled out into a separate ticket if necessary?) > NullPointerException in AggregateMetric > --- > > Key: SOLR-14252 > URL: https://issues.apache.org/jira/browse/SOLR-14252 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) > Components: metrics >Reporter: Andy Webb >Assignee: Andrzej Bialecki >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > The {{getMax}} and {{getMin}} methods in > [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] > can throw an NPE if non-{{Number}} values are present in {{values}}, when it > tries to cast a {{null}} {{Double}} to a {{double}}. > This PR prevents the NPE occurring: > [https://github.com/apache/lucene-solr/pull/1265] > (We've also noticed an error in the documentation - see > https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 > - this could be pulled out into a separate ticket if necessary?)
[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric
[ https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14252: - Description: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. This PR prevents the NPE occurring: [https://github.com/apache/lucene-solr/pull/1247] (We've also noticed an error in the documentation - see https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 - this could be pulled out into a separate ticket if necessary?) was: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. This PR prevents the NPE occurring and triggers warnings instead: [https://github.com/apache/lucene-solr/pull/1247] We've seen it report {{not a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE may have been hiding other issues with metrics gathering, which warrant further investigation. (We've also noticed an error in the documentation - see https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 - this could be pulled out into a separate ticket if necessary?) > NullPointerException in AggregateMetric > --- > > Key: SOLR-14252 > URL: https://issues.apache.org/jira/browse/SOLR-14252 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) > Components: metrics >Reporter: Andy Webb >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > The {{getMax}} and {{getMin}} methods in > [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] > can throw an NPE if non-{{Number}} values are present in {{values}}, when it > tries to cast a {{null}} {{Double}} to a {{double}}. > This PR prevents the NPE occurring: > [https://github.com/apache/lucene-solr/pull/1247] > (We've also noticed an error in the documentation - see > https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 > - this could be pulled out into a separate ticket if necessary?)
[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric
[ https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14252: - Description: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. This PR prevents the NPE occurring and triggers warnings instead: [https://github.com/apache/lucene-solr/pull/1247] We've seen it report {{not a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE may have been hiding other issues with metrics gathering, which warrant further investigation. (We've also noticed an error in the documentation - see https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 - this could be pulled out into a separate ticket if necessary?) was: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. This PR prevents the NPE occurring and triggers warnings instead: [https://github.com/apache/lucene-solr/pull/1247] We've seen it report {{not a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE may have been hiding other issues with metrics gathering, which warrant further investigation. > NullPointerException in AggregateMetric > --- > > Key: SOLR-14252 > URL: https://issues.apache.org/jira/browse/SOLR-14252 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) > Components: metrics >Reporter: Andy Webb >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > The {{getMax}} and {{getMin}} methods in > [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] > can throw an NPE if non-{{Number}} values are present in {{values}}, when it > tries to cast a {{null}} {{Double}} to a {{double}}. > This PR prevents the NPE occurring and triggers warnings instead: > [https://github.com/apache/lucene-solr/pull/1247] > We've seen it report {{not a Number: false}} and {{not a Number: > LocalStatsCache}} - so the NPE may have been hiding other issues with metrics > gathering, which warrant further investigation. > (We've also noticed an error in the documentation - see > https://github.com/apache/lucene-solr/commit/109d3411cd3866d83273187170dbc5b8b3211d20 > - this could be pulled out into a separate ticket if necessary?)
[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric
[ https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14252: - Description: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. This PR prevents the NPE occurring and triggers warnings instead: [https://github.com/apache/lucene-solr/pull/1247] We've seen it report {{not a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE may have been hiding other issues with metrics gathering, which warrant further investigation. was: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. We've seen the values {{false}} PR: [https://github.com/apache/lucene-solr/pull/1247] The patch adds a warning for non-{{Number}} values - we've seen it report {{not a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE may have been hiding other issues with metrics gathering, which warrant further investigation. > NullPointerException in AggregateMetric > --- > > Key: SOLR-14252 > URL: https://issues.apache.org/jira/browse/SOLR-14252 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) > Components: metrics >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The {{getMax}} and {{getMin}} methods in > [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] > can throw an NPE if non-{{Number}} values are present in {{values}}, when it > tries to cast a {{null}} {{Double}} to a {{double}}. > This PR prevents the NPE occurring and triggers warnings instead: > [https://github.com/apache/lucene-solr/pull/1247] > We've seen it report {{not a Number: false}} and {{not a Number: > LocalStatsCache}} - so the NPE may have been hiding other issues with metrics > gathering, which warrant further investigation.
[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric
[ https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14252: - Description: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. We've seen the values {{false}} PR: [https://github.com/apache/lucene-solr/pull/1247] The patch adds a warning for non-{{Number}} values - we've seen it report {{not a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE may have been hiding other issues with metrics gathering, which warrant further investigation. was: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. PR: [https://github.com/apache/lucene-solr/pull/1247] > NullPointerException in AggregateMetric > --- > > Key: SOLR-14252 > URL: https://issues.apache.org/jira/browse/SOLR-14252 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The {{getMax}} and {{getMin}} methods in > [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] > can throw an NPE if non-{{Number}} values are present in {{values}}, when it > tries to cast a {{null}} {{Double}} to a {{double}}. 
We've seen the values > {{false}} > PR: [https://github.com/apache/lucene-solr/pull/1247] > The patch adds a warning for non-{{Number}} values - we've seen it report > {{not a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE > may have been hiding other issues with metrics gathering, which warrant > further investigation.
[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric
[ https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14252: - Description: The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. PR: [https://github.com/apache/lucene-solr/pull/1247] was:The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}. > NullPointerException in AggregateMetric > --- > > Key: SOLR-14252 > URL: https://issues.apache.org/jira/browse/SOLR-14252 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The {{getMax}} and {{getMin}} methods in > [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] > can throw an NPE if non-{{Number}} values are present in {{values}}, when it > tries to cast a {{null}} {{Double}} to a {{double}}. > PR: [https://github.com/apache/lucene-solr/pull/1247]
[jira] [Created] (SOLR-14252) NullPointerException in AggregateMetric
Andy Webb created SOLR-14252: Summary: NullPointerException in AggregateMetric Key: SOLR-14252 URL: https://issues.apache.org/jira/browse/SOLR-14252 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: metrics Reporter: Andy Webb The {{getMax}} and {{getMin}} methods in [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java] can throw an NPE if non-{{Number}} values are present in {{values}}, when it tries to cast a {{null}} {{Double}} to a {{double}}.
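The failure mode described in this report (auto-unboxing a {{null}} {{Double}} into a primitive {{double}}) can be reproduced in isolation. The sketch below is an illustration only, with hypothetical class and method names; it is not the AggregateMetric source. Falling back to 0 when no numeric value is present is an assumption, chosen because the sample output posted in a later comment on this issue shows zeroed aggregates for non-numeric metrics.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not Solr code.
class NullUnboxingSketch {

    // The running result stays null when no value is a Number, and returning it
    // as a primitive double forces an unboxing of null.
    static double unsafeMax(List<Object> values) {
        Double res = null;
        for (Object v : values) {
            if (!(v instanceof Number)) {
                continue; // non-numeric values are skipped
            }
            double d = ((Number) v).doubleValue();
            if (res == null || d > res) {
                res = d;
            }
        }
        return res; // throws NullPointerException if res is still null
    }

    // Same loop, but guards the final unboxing (fallback of 0 is an assumption).
    static double safeMax(List<Object> values) {
        Double res = null;
        for (Object v : values) {
            if (!(v instanceof Number)) {
                continue;
            }
            double d = ((Number) v).doubleValue();
            if (res == null || d > res) {
                res = d;
            }
        }
        return res == null ? 0.0 : res;
    }

    public static void main(String[] args) {
        List<Object> nonNumeric = Arrays.<Object>asList("LocalStatsCache", false);
        System.out.println("safeMax: " + safeMax(nonNumeric));
        try {
            unsafeMax(nonNumeric);
        } catch (NullPointerException e) {
            System.out.println("unsafeMax threw NullPointerException");
        }
    }
}
```

The same guard would apply to a minimum calculation; only the comparison changes.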
[jira] [Commented] (SOLR-14239) Fix the behavior of CaffeineCache.computeIfAbsent on branch_8x
[ https://issues.apache.org/jira/browse/SOLR-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030145#comment-17030145 ] Andy Webb commented on SOLR-14239: -- Thanks Andrzej! > Fix the behavior of CaffeineCache.computeIfAbsent on branch_8x > -- > > Key: SOLR-14239 > URL: https://issues.apache.org/jira/browse/SOLR-14239 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.4.1 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: 8.5 > > > Spin-off from SOLR-13817, spotted by Andy Webb. > The back-port of {{CaffeineCache}} to branch_8x is missing a conditional > statement in {{computeIfAbsent}} which breaks the contract of this method, > i.e. if the loading function returns null the cache mapping should remain > unchanged and a null value should be returned. > Note: this only affects branch_8x, master already has this conditional. > This issue also modifies comments in the default {{solrconfig.xml}} to > indicate that the old cache implementations existing in branch_8x are > deprecated in favor of CaffeineCache.
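The contract quoted above (a loading function that returns null leaves the mapping unchanged, and null is returned) can be sketched with a hand-rolled helper over a plain {{java.util.Map}}. This is a hypothetical illustration of the kind of conditional the issue refers to, not the code in {{org.apache.solr.search.CaffeineCache}}; the class and method names are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of the computeIfAbsent contract, not Solr code.
class ComputeIfAbsentContractSketch {

    static <K, V> V computeIfAbsent(Map<K, V> store, K key,
                                    Function<? super K, ? extends V> loader) {
        V existing = store.get(key);
        if (existing != null) {
            return existing; // already cached: the loader is not invoked
        }
        V loaded = loader.apply(key);
        if (loaded != null) { // the conditional: only store non-null results
            store.put(key, loaded);
        }
        return loaded; // null when the loader produced nothing
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        String miss = computeIfAbsent(store, "q1", k -> null);
        System.out.println("value=" + miss + ", cached=" + store.containsKey("q1"));
        String hit = computeIfAbsent(store, "q2", k -> "docs-for-" + k);
        System.out.println("value=" + hit + ", cached=" + store.containsKey("q2"));
    }
}
```

The standard {{java.util.Map#computeIfAbsent}} documents the same behaviour: if the mapping function returns null, no mapping is recorded.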
[jira] [Commented] (SOLR-13817) Deprecate and remove legacy SolrCache implementations
[ https://issues.apache.org/jira/browse/SOLR-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026661#comment-17026661 ] Andy Webb commented on SOLR-13817: -- Great, thanks Andrzej! > Deprecate and remove legacy SolrCache implementations > - > > Key: SOLR-13817 > URL: https://issues.apache.org/jira/browse/SOLR-13817 > Project: Solr > Issue Type: Improvement >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: master (9.0) > > Attachments: SOLR-13817-8x.patch, SOLR-13817-master.patch > > > Now that SOLR-8241 has been committed I propose to deprecate other cache > implementations in 8x and remove them altogether from 9.0, in order to reduce > confusion and maintenance costs.
[jira] [Commented] (SOLR-13817) Deprecate and remove legacy SolrCache implementations
[ https://issues.apache.org/jira/browse/SOLR-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025947#comment-17025947 ] Andy Webb commented on SOLR-13817: -- Oh, sorry - we're talking at cross-purposes. I didn't mean CaffeineCache 2.8.1 - I was referring to the version of {{org.apache.solr.search.CaffeineCache}} (with the null check) that's currently in master but not branch_8x. (I don't know the reason for the null check - is the version without it just as good to use?) Also while it's called out as a preferred implementation in [the docs|https://lucene.apache.org/solr/guide/8_4/query-settings-in-solrconfig.html], the [default solrconfig.xml in branch_8x|https://github.com/apache/lucene-solr/blob/branch_8x/solr/server/solr/configsets/_default/conf/solrconfig.xml] doesn't yet mention CaffeineCache as an option - I think it'd be good to replicate the deprecation warning in that config file. Andy > Deprecate and remove legacy SolrCache implementations > - > > Key: SOLR-13817 > URL: https://issues.apache.org/jira/browse/SOLR-13817 > Project: Solr > Issue Type: Improvement >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: master (9.0) > > Attachments: SOLR-13817-8x.patch, SOLR-13817-master.patch > > > Now that SOLR-8241 has been committed I propose to deprecate other cache > implementations in 8x and remove them altogether from 9.0, in order to reduce > confusion and maintenance costs.
[jira] [Commented] (SOLR-13817) Deprecate and remove legacy SolrCache implementations
[ https://issues.apache.org/jira/browse/SOLR-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025295#comment-17025295 ] Andy Webb commented on SOLR-13817: -- Could I put in a request that we get to use the final version of CaffeineCache in 8.5.0+ before the legacy cache implementations are removed in 9.0.0 please? Currently https://github.com/apache/lucene-solr/commit/b4fe911cc8e4bddff18226bc8c98a2deb735a8fc#diff-fc056ba10fcf92dc69fe32991cdad5f0 (in master) both updates CaffeineCache.java and removes FastLRUCache etc. thanks, Andy > Deprecate and remove legacy SolrCache implementations > - > > Key: SOLR-13817 > URL: https://issues.apache.org/jira/browse/SOLR-13817 > Project: Solr > Issue Type: Improvement >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: master (9.0) > > Attachments: SOLR-13817-8x.patch, SOLR-13817-master.patch > > > Now that SOLR-8241 has been committed I propose to deprecate other cache > implementations in 8x and remove them altogether from 9.0, in order to reduce > confusion and maintenance costs.
[jira] [Commented] (SOLR-14095) Replace Java serialization with Javabin in Overseer operations
[ https://issues.apache.org/jira/browse/SOLR-14095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024293#comment-17024293 ] Andy Webb commented on SOLR-14095: -- hi Tomas - no worries, I've put some further notes on SOLR-14219. Andy > Replace Java serialization with Javabin in Overseer operations > -- > > Key: SOLR-14095 > URL: https://issues.apache.org/jira/browse/SOLR-14095 > Project: Solr > Issue Type: Task >Reporter: Robert Muir >Assignee: Tomas Eduardo Fernandez Lobbe >Priority: Major > Fix For: master (9.0), 8.5 > > Attachments: SOLR-14095-json.patch, json-nl.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Removing the use of serialization is greatly preferred. > But if serialization over the wire must really happen, then we must use JDK's > serialization filtering capability to prevent havoc. > https://docs.oracle.com/javase/10/core/serialization-filtering1.htm#JSCOR-GUID-3ECB288D-E5BD-4412-892F-E9BB11D4C98A
[jira] [Commented] (SOLR-14219) OverseerSolrResponse's serialVersionUID has changed
[ https://issues.apache.org/jira/browse/SOLR-14219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024249#comment-17024249 ] Andy Webb commented on SOLR-14219: -- We have a (non-production) cluster with three nodes - two are running 8.4.1, the third has 8.5.0 with {{useUnsafeOverseerResponse=true}}. This simulates the situation we'd have part-way through a rolling update in production. If one of the 8.4.1 nodes is the overseer, requesting {{/solr/admin/collections?action=overseerstatus}} on the other 8.4.1 node is OK, but on the 8.5.0 node it triggers an {{InvalidClassException}}. If the 8.5.0 node is the overseer, that request to either of the 8.4.1 nodes triggers the exception. With the fix in the PR in place to force the new version of OverseerSolrResponse to have the same serialVersionUID as before, all three nodes' overseers are compatible. (NB an 8.4.0 node would not be compatible with earlier or newer versions due to its SolrResponse having a different UID - that's what led to SOLR-14165.) > OverseerSolrResponse's serialVersionUID has changed > --- > > Key: SOLR-14219 > URL: https://issues.apache.org/jira/browse/SOLR-14219 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > When the {{useUnsafeOverseerResponse=true}} option introduced in SOLR-14095 > is used, the serialized OverseerSolrResponse has a different serialVersionUID > to earlier versions, making it backwards-incompatible. > https://github.com/apache/lucene-solr/pull/1210 forces the serialVersionUID > to its old value, so old and new nodes become compatible.
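For readers less familiar with Java serialization, the general technique of forcing a serialVersionUID can be sketched as below. The classes are hypothetical stand-ins, and the value 42L is a placeholder, not the real OverseerSolrResponse UID; the sketch only shows that an explicit {{serialVersionUID}} field is what the serialization runtime reports for a class, whereas a class without one gets a value derived from its structure, which can change when the class is modified.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical illustration, not Solr code.
class SerialVersionUidSketch {

    // Explicit UID: stays the same however the class body changes afterwards.
    static class PinnedResponse implements Serializable {
        private static final long serialVersionUID = 42L; // placeholder value (assumption)
        String payload;
    }

    // No explicit UID: the runtime computes one from the class's name, fields and methods.
    static class UnpinnedResponse implements Serializable {
        String payload;
    }

    // Returns the UID the serialization runtime would write for this class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("pinned:   " + uidOf(PinnedResponse.class));
        System.out.println("unpinned: " + uidOf(UnpinnedResponse.class));
    }
}
```

On deserialization the receiving side compares the UID in the stream with that of its local class and throws {{java.io.InvalidClassException}} on a mismatch, which is the exception reported in this thread for mixed-version nodes.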
[jira] [Updated] (SOLR-14219) OverseerSolrResponse's serialVersionUID has changed
[ https://issues.apache.org/jira/browse/SOLR-14219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14219: - Status: Patch Available (was: Open) > OverseerSolrResponse's serialVersionUID has changed > --- > > Key: SOLR-14219 > URL: https://issues.apache.org/jira/browse/SOLR-14219 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > When the {{useUnsafeOverseerResponse=true}} option introduced in SOLR-14095 > is used, the serialized OverseerSolrResponse has a different serialVersionUID > to earlier versions, making it backwards-incompatible. > https://github.com/apache/lucene-solr/pull/1210 forces the serialVersionUID > to its old value, so old and new nodes become compatible.
[jira] [Commented] (SOLR-14095) Replace Java serialization with Javabin in Overseer operations
[ https://issues.apache.org/jira/browse/SOLR-14095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023866#comment-17023866 ] Andy Webb commented on SOLR-14095: -- hi, I've been experimenting with upgrading to 8.5.0+ in a prototyping environment using {{useUnsafeOverseerResponse=true}} and have found that a mixed pool of older/newer nodes gives the exception {{java.io.InvalidClassException: org.apache.solr.cloud.OverseerSolrResponse; local class incompatible: stream classdesc serialVersionUID = 4721653044098960880, local class serialVersionUID = -3791204262816422245}} (or vice-versa, depending on which node is the overseer). I've attached a PR to SOLR-14219 which I've found resolved this issue - please would someone review this? thanks, Andy > Replace Java serialization with Javabin in Overseer operations > -- > > Key: SOLR-14095 > URL: https://issues.apache.org/jira/browse/SOLR-14095 > Project: Solr > Issue Type: Task >Reporter: Robert Muir >Assignee: Tomas Eduardo Fernandez Lobbe >Priority: Major > Fix For: master (9.0), 8.5 > > Attachments: SOLR-14095-json.patch, json-nl.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Removing the use of serialization is greatly preferred. > But if serialization over the wire must really happen, then we must use JDK's > serialization filtering capability to prevent havoc. > https://docs.oracle.com/javase/10/core/serialization-filtering1.htm#JSCOR-GUID-3ECB288D-E5BD-4412-892F-E9BB11D4C98A -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
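The exception quoted in this comment arises because Java serialization compares the serialVersionUID recorded in the stream with the one on the local class. As a rough illustration of the remedy discussed in SOLR-14219 (declaring the UID explicitly so it stays fixed), here is a minimal sketch. The class name `PinnedResponse` and the value `42L` are hypothetical placeholders; this is not Solr's OverseerSolrResponse, nor its real UID.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical stand-in for a response class; not Solr's OverseerSolrResponse.
final class PinnedResponse implements Serializable {
    // Declared explicitly so that later changes to fields or methods do not
    // alter the UID the JVM would otherwise compute from the class signature.
    // 42L is a placeholder value for illustration only.
    private static final long serialVersionUID = 42L;

    final String status;

    PinnedResponse(String status) {
        this.status = status;
    }
}

final class SerialVersionUidDemo {
    // Returns the UID that Java serialization associates with the class.
    static long uidOf(Class<?> cls) {
        return ObjectStreamClass.lookup(cls).getSerialVersionUID();
    }

    // Serializes and deserializes a value through an in-memory buffer.
    static PinnedResponse roundTrip(PinnedResponse in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream oin =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PinnedResponse) oin.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("serialVersionUID = " + uidOf(PinnedResponse.class));
        System.out.println("status after round trip = " + roundTrip(new PinnedResponse("ok")).status);
    }
}
```

Two nodes whose versions of a class declare the same serialVersionUID will accept each other's streams as far as this check is concerned. Whether the field layouts remain compatible is a separate question that the UID alone does not answer.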
[jira] [Updated] (SOLR-14219) OverseerSolrResponse's serialVersionUID has changed
[ https://issues.apache.org/jira/browse/SOLR-14219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14219: - Description: When the {{useUnsafeOverseerResponse=true}} option introduced in SOLR-14095 is used, the serialized OverseerSolrResponse has a different serialVersionUID to earlier versions, making it backwards-incompatible. https://github.com/apache/lucene-solr/pull/1210 forces the serialVersionUID to its old value, so old and new nodes become compatible. was: When the {{useUnsafeOverseerResponse=true}} option introduced in SOLR-14095 is used, the serialized OverseerSolrResponse has a different serialVersionUID to earlier versions, making it backwards-incompatible. (PR incoming) > OverseerSolrResponse's serialVersionUID has changed > --- > > Key: SOLR-14219 > URL: https://issues.apache.org/jira/browse/SOLR-14219 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > When the {{useUnsafeOverseerResponse=true}} option introduced in SOLR-14095 > is used, the serialized OverseerSolrResponse has a different serialVersionUID > to earlier versions, making it backwards-incompatible. > https://github.com/apache/lucene-solr/pull/1210 forces the serialVersionUID > to its old value, so old and new nodes become compatible. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14219) OverseerSolrResponse's serialVersionUID has changed
Andy Webb created SOLR-14219: Summary: OverseerSolrResponse's serialVersionUID has changed Key: SOLR-14219 URL: https://issues.apache.org/jira/browse/SOLR-14219 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: SolrCloud Reporter: Andy Webb When the {{useUnsafeOverseerResponse=true}} option introduced in SOLR-14095 is used, the serialized OverseerSolrResponse has a different serialVersionUID to earlier versions, making it backwards-incompatible. (PR incoming)
[jira] [Commented] (SOLR-14189) Some whitespace characters bypass zero-length test in query parsers leading to 400 Bad Request
[ https://issues.apache.org/jira/browse/SOLR-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023821#comment-17023821 ] Andy Webb commented on SOLR-14189: -- Happy to help - thanks Uwe (and Christine)! > Some whitespace characters bypass zero-length test in query parsers leading > to 400 Bad Request > -- > > Key: SOLR-14189 > URL: https://issues.apache.org/jira/browse/SOLR-14189 > Project: Solr > Issue Type: Improvement > Components: query parsers >Reporter: Andy Webb >Assignee: Uwe Schindler >Priority: Major > Fix For: master (9.0), 8.5 > > Time Spent: 2h > Remaining Estimate: 0h > > The edismax and some other query parsers treat pure whitespace queries as > empty queries, but they use Java's {{String.trim()}} method to normalise > queries. That method [only treats characters 0-32 as > whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. > Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - > which bypass the test and lead to {{400 Bad Request}} responses - see for > example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs > {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the > exception: > {noformat} > org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at > line 1, column 0. Was expecting one of: ... "+" ... "-" ... > ... "(" ... "*" ... ... ... ... ... > ... "[" ... "{" ... ... "filter(" ... ... > ... > {noformat} > [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, > edismax and rerank query parsers to use > [StringUtils.isWhitespace()|https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#isWhitespace-java.lang.String-] > which is aware of all whitespace characters. 
> Prior to the change, rerank behaves differently for U+3000 and U+0020 - with > the change, both the below give the "mandatory parameter" message: > {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%E3%80%80}} > - generic 400 Bad Request > {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%20}} > - 400 reporting "reRankQuery parameter is mandatory" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
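The difference between the two requests in this issue can be reproduced with the JDK alone. The sketch below is an illustration under assumptions, not the code from PR 1172: `isAllWhitespace` is a hypothetical helper built on `Character.isWhitespace`, which recognises U+3000, whereas `String.trim()` only strips characters up to U+0020.

```java
final class WhitespaceCheckDemo {
    // Hypothetical helper: true when every character is whitespace according
    // to Character.isWhitespace (also true for the empty string).
    static boolean isAllWhitespace(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // The older style of check: trim, then test for zero length.
    static boolean isEmptyAfterTrim(String s) {
        return s.trim().isEmpty();
    }

    public static void main(String[] args) {
        String asciiSpace = " ";            // U+0020, sent as %20
        String ideographicSpace = "\u3000"; // U+3000, sent as %E3%80%80
        System.out.println(isEmptyAfterTrim(asciiSpace));
        System.out.println(isEmptyAfterTrim(ideographicSpace));
        System.out.println(isAllWhitespace(ideographicSpace));
    }
}
```

With a trim-based check, the ideographic space survives as a one-character query and reaches the parser, which is consistent with the syntax error quoted in the issue. A check based on `Character.isWhitespace` treats both inputs as blank.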
[jira] [Updated] (SOLR-14189) Some whitespace characters bypass zero-length test in query parsers leading to 400 Bad Request
[ https://issues.apache.org/jira/browse/SOLR-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14189: - Description: The edismax and some other query parsers treat pure whitespace queries as empty queries, but they use Java's {{String.trim()}} method to normalise queries. That method [only treats characters 0-32 as whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which bypass the test and lead to {{400 Bad Request}} responses - see for example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the exception: {noformat} org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at line 1, column 0. Was expecting one of: ... "+" ... "-" ... ... "(" ... "*" ... ... ... ... ... ... "[" ... "{" ... ... "filter(" ... ... ... {noformat} [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, edismax and rerank query parsers to use [StringUtils.isWhitespace()|https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#isWhitespace-java.lang.String-] which is aware of all whitespace characters. Prior to the change, rerank behaves differently for U+3000 and U+0020 - with the change, both the below give the "mandatory parameter" message: {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%E3%80%80}} - generic 400 Bad Request {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%20}} - 400 reporting "reRankQuery parameter is mandatory" was: The edismax and some other query parsers treat pure whitespace queries as empty queries, but they use Java's {{String.trim()}} method to normalise queries. 
That method [only treats characters 0-32 as whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which bypass the test and lead to {{400 Bad Request}} responses - see for example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the exception: {noformat} org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at line 1, column 0. Was expecting one of: ... "+" ... "-" ... ... "(" ... "*" ... ... ... ... ... ... "[" ... "{" ... ... "filter(" ... ... ... {noformat} [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, edismax and rerank query parsers to use [StringUtils.stripToNull()|https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#stripToNull-java.lang.String-] which is aware of all whitespace characters. Prior to the change, rerank behaves differently for U+3000 and U+0020 - with the change, both the below give the "mandatory parameter" message: {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%E3%80%80}} - generic 400 Bad Request {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%20}} - 400 reporting "reRankQuery parameter is mandatory" > Some whitespace characters bypass zero-length test in query parsers leading > to 400 Bad Request > -- > > Key: SOLR-14189 > URL: https://issues.apache.org/jira/browse/SOLR-14189 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Reporter: Andy Webb >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > The edismax and some other query parsers treat pure whitespace queries as > empty queries, but they use Java's {{String.trim()}} method to normalise > queries. 
That method [only treats characters 0-32 as > whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. > Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - > which bypass the test and lead to {{400 Bad Request}} responses - see for > example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs > {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the > exception: > {noformat} > org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at > line 1, column 0. Was expecting one of: ... "+" ... "-" ... > ... "(" ... "*" ... ... ... ... ... > ... "[" ... "{" ... ... "filter(" ... ... > ... > {noformat} > [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, > edismax and rerank query parsers to use > [S
[jira] [Updated] (SOLR-14189) Some whitespace characters bypass zero-length test in query parsers leading to 400 Bad Request
[ https://issues.apache.org/jira/browse/SOLR-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14189: - Status: Patch Available (was: Open) > Some whitespace characters bypass zero-length test in query parsers leading > to 400 Bad Request > -- > > Key: SOLR-14189 > URL: https://issues.apache.org/jira/browse/SOLR-14189 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The edismax and some other query parsers treat pure whitespace queries as > empty queries, but they use Java's {{String.trim()}} method to normalise > queries. That method [only treats characters 0-32 as > whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. > Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - > which bypass the test and lead to {{400 Bad Request}} responses - see for > example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs > {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the > exception: > {noformat} > org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at > line 1, column 0. Was expecting one of: ... "+" ... "-" ... > ... "(" ... "*" ... ... ... ... ... > ... "[" ... "{" ... ... "filter(" ... ... > ... > {noformat} > [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, > edismax and rerank query parsers to use > [StringUtils.stripToNull()|https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#stripToNull-java.lang.String-] > which is aware of all whitespace characters. 
> Prior to the change, rerank behaves differently for U+3000 and U+0020 - with > the change, both the below give the "mandatory parameter" message: > {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%E3%80%80}} > - generic 400 Bad Request > {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%20}} > - 400 reporting "reRankQuery parameter is mandatory" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14189) Some whitespace characters bypass zero-length test in query parsers leading to 400 Bad Request
[ https://issues.apache.org/jira/browse/SOLR-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14189: - Description: The edismax and some other query parsers treat pure whitespace queries as empty queries, but they use Java's {{String.trim()}} method to normalise queries. That method [only treats characters 0-32 as whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which bypass the test and lead to {{400 Bad Request}} responses - see for example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the exception: {noformat} org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at line 1, column 0. Was expecting one of: ... "+" ... "-" ... ... "(" ... "*" ... ... ... ... ... ... "[" ... "{" ... ... "filter(" ... ... ... {noformat} [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, edismax and rerank query parsers to use [StringUtils.stripToNull()|https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#stripToNull-java.lang.String-] which is aware of all whitespace characters. Prior to the change, rerank behaves differently for U+3000 and U+0020 - with the change, both the below give the "mandatory parameter" message: {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%E3%80%80}} - generic 400 Bad Request {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reRankWeight=3}&rqq=%20}} - 400 reporting "reRankQuery parameter is mandatory" was: The edismax and some other query parsers treat pure whitespace queries as empty queries, but they use Java's {{String.trim()}} method to normalise queries. 
That method [only treats characters 0-32 as whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which bypass the test and lead to {{400 Bad Request}} responses - see for example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the exception: {noformat} org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at line 1, column 0. Was expecting one of: ... "+" ... "-" ... ... "(" ... "*" ... ... ... ... ... ... "[" ... "{" ... ... "filter(" ... ... ... {noformat} [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, edismax and rerank query parsers to use [StringUtils.stripToNull()|https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#stripToNull-java.lang.String-] which is aware of all whitespace characters. > Some whitespace characters bypass zero-length test in query parsers leading > to 400 Bad Request > -- > > Key: SOLR-14189 > URL: https://issues.apache.org/jira/browse/SOLR-14189 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The edismax and some other query parsers treat pure whitespace queries as > empty queries, but they use Java's {{String.trim()}} method to normalise > queries. That method [only treats characters 0-32 as > whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. > Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - > which bypass the test and lead to {{400 Bad Request}} responses - see for > example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs > {{/solr/mycollection/select?q=%20&defType=edismax}}. 
The first fails with the > exception: > {noformat} > org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at > line 1, column 0. Was expecting one of: ... "+" ... "-" ... > ... "(" ... "*" ... ... ... ... ... > ... "[" ... "{" ... ... "filter(" ... ... > ... > {noformat} > [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, > edismax and rerank query parsers to use > [StringUtils.stripToNull()|https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#stripToNull-java.lang.String-] > which is aware of all whitespace characters. > Prior to the change, rerank behaves differently for U+3000 and U+0020 - with > the change, both the below give the "mandatory parameter" message: > {{q=greetings&rq=\{!rerank%20reRankQuery=$rqq%20reRankDocs=1000%20reR
[jira] [Updated] (SOLR-14189) Some whitespace characters bypass zero-length test in query parsers leading to 400 Bad Request
[ https://issues.apache.org/jira/browse/SOLR-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14189: - Description: The edismax and some other query parsers treat pure whitespace queries as empty queries, but they use Java's {{String.trim()}} method to normalise queries. That method [only treats characters 0-32 as whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which bypass the test and lead to {{400 Bad Request}} responses - see for example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the exception: {noformat} org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at line 1, column 0. Was expecting one of: ... "+" ... "-" ... ... "(" ... "*" ... ... ... ... ... ... "[" ... "{" ... ... "filter(" ... ... ... {noformat} [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, edismax and rerank query parsers to use [StringUtils.stripToNull()|https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#stripToNull-java.lang.String-] which is aware of all whitespace characters. was: The edismax and some other query parsers treat pure whitespace queries as empty queries, but they use Java's {{String.trim()}} method to normalise queries. That method only treats characters 0-32 as whitespace. Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which bypass the test and lead to {{400 Bad Request}} responses - see for example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the exception: {noformat} org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at line 1, column 0. Was expecting one of: ... "+" ... "-" ... ... "(" ... "*" ... ... 
... ... ... ... "[" ... "{" ... ... "filter(" ... ... ... {noformat} PR: https://github.com/apache/lucene-solr/pull/1172 > Some whitespace characters bypass zero-length test in query parsers leading > to 400 Bad Request > -- > > Key: SOLR-14189 > URL: https://issues.apache.org/jira/browse/SOLR-14189 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The edismax and some other query parsers treat pure whitespace queries as > empty queries, but they use Java's {{String.trim()}} method to normalise > queries. That method [only treats characters 0-32 as > whitespace|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#trim--]. > Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - > which bypass the test and lead to {{400 Bad Request}} responses - see for > example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs > {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the > exception: > {noformat} > org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at > line 1, column 0. Was expecting one of: ... "+" ... "-" ... > ... "(" ... "*" ... ... ... ... ... > ... "[" ... "{" ... ... "filter(" ... ... > ... > {noformat} > [PR 1172|https://github.com/apache/lucene-solr/pull/1172] updates the dismax, > edismax and rerank query parsers to use > [StringUtils.stripToNull()|https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#stripToNull-java.lang.String-] > which is aware of all whitespace characters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14189) Some whitespace characters bypass zero-length test in query parsers leading to 400 Bad Request
[ https://issues.apache.org/jira/browse/SOLR-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14189: - Description: The edismax and some other query parsers treat pure whitespace queries as empty queries, but they use Java's {{String.trim()}} method to normalise queries. That method only treats characters 0-32 as whitespace. Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which bypass the test and lead to {{400 Bad Request}} responses - see for example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the exception: {noformat} org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at line 1, column 0. Was expecting one of: ... "+" ... "-" ... ... "(" ... "*" ... ... ... ... ... ... "[" ... "{" ... ... "filter(" ... ... ... {noformat} PR: https://github.com/apache/lucene-solr/pull/1172 was: The edismax and some other query parsers treat pure whitespace queries as empty queries, but they use Java's {{String.trim()}} method to normalise queries. That method only treats characters 0-32 as whitespace. Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which bypass the test and lead to {{400 Bad Request}} responses - see for example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the exception: {noformat} org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at line 1, column 0. Was expecting one of: ... "+" ... "-" ... ... "(" ... "*" ... ... ... ... ... ... "[" ... "{" ... ... "filter(" ... ... ... {noformat} (PR incoming!) 
> Some whitespace characters bypass zero-length test in query parsers leading > to 400 Bad Request > -- > > Key: SOLR-14189 > URL: https://issues.apache.org/jira/browse/SOLR-14189 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The edismax and some other query parsers treat pure whitespace queries as > empty queries, but they use Java's {{String.trim()}} method to normalise > queries. That method only treats characters 0-32 as whitespace. Other > whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which > bypass the test and lead to {{400 Bad Request}} responses - see for example > {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs > {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the > exception: > {noformat} > org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at > line 1, column 0. Was expecting one of: ... "+" ... "-" ... > ... "(" ... "*" ... ... ... ... ... > ... "[" ... "{" ... ... "filter(" ... ... > ... > {noformat} > PR: https://github.com/apache/lucene-solr/pull/1172 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14189) Some whitespace characters bypass zero-length test in query parsers
Andy Webb created SOLR-14189: Summary: Some whitespace characters bypass zero-length test in query parsers Key: SOLR-14189 URL: https://issues.apache.org/jira/browse/SOLR-14189 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: query parsers Reporter: Andy Webb The edismax and some other query parsers treat pure whitespace queries as empty queries, but they use Java's {{String.trim()}} method to normalise queries. That method only treats characters 0-32 as whitespace. Other whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which bypass the test and lead to {{400 Bad Request}} responses - see for example {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the exception: {noformat} org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at line 1, column 0. Was expecting one of: ... "+" ... "-" ... ... "(" ... "*" ... ... ... ... ... ... "[" ... "{" ... ... "filter(" ... ... ... {noformat} (PR incoming!) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14189) Some whitespace characters bypass zero-length test in query parsers leading to 400 Bad Request
[ https://issues.apache.org/jira/browse/SOLR-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14189: - Summary: Some whitespace characters bypass zero-length test in query parsers leading to 400 Bad Request (was: Some whitespace characters bypass zero-length test in query parsers) > Some whitespace characters bypass zero-length test in query parsers leading > to 400 Bad Request > -- > > Key: SOLR-14189 > URL: https://issues.apache.org/jira/browse/SOLR-14189 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Reporter: Andy Webb >Priority: Major > > The edismax and some other query parsers treat pure whitespace queries as > empty queries, but they use Java's {{String.trim()}} method to normalise > queries. That method only treats characters 0-32 as whitespace. Other > whitespace characters exist - such as {{U+3000 IDEOGRAPHIC SPACE}} - which > bypass the test and lead to {{400 Bad Request}} responses - see for example > {{/solr/mycollection/select?q=%E3%80%80&defType=edismax}} vs > {{/solr/mycollection/select?q=%20&defType=edismax}}. The first fails with the > exception: > {noformat} > org.apache.solr.search.SyntaxError: Cannot parse '': Encountered "" at > line 1, column 0. Was expecting one of: ... "+" ... "-" ... > ... "(" ... "*" ... ... ... ... ... > ... "[" ... "{" ... ... "filter(" ... ... > ... > {noformat} > (PR incoming!) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14165) SolrResponse serialVersionUID has changed
[ https://issues.apache.org/jira/browse/SOLR-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17011913#comment-17011913 ] Andy Webb commented on SOLR-14165: -- Thanks Noble! > SolrResponse serialVersionUID has changed > - > > Key: SOLR-14165 > URL: https://issues.apache.org/jira/browse/SOLR-14165 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.4 >Reporter: Andy Webb >Assignee: Noble Paul >Priority: Blocker > Fix For: 8.4.1 > > Time Spent: 40m > Remaining Estimate: 0h > > SOLR-13821 changed the signature of > {{org.apache.solr.client.solrj.SolrResponse}}, making serialisations of the > class incompatible between versions. > Original text from SOLR-13821: > {quote} > hi, > We've been experimenting with doing a rolling in-place upgrade from Solr > 8.3.1 to 8.4.0 on a non-production system, but have found that we get this > exception for some operations, including when requesting > /solr/admin/collections?action=overseerstatus on a node whose version is > inconsistent with the overseer: > java.io.InvalidClassException: org.apache.solr.client.solrj.SolrResponse; > local class incompatible: stream classdesc serialVersionUID = > -7931100103360242645, local class serialVersionUID = 2239939671435624715 > As far as I can see, this is due to the change to the SolrResponse class's > signature in commit e3bd5a7. My experimentation has shown that if the > serialVersionUID of that class is set explicitly to its previous value the > exception no longer occurs. > I'm not sure if this is a necessary or good fix, but I wanted to share this > issue with you in case it's something that you think needs resolving. > thanks, > Andy > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13821) Package Store
[ https://issues.apache.org/jira/browse/SOLR-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17011727#comment-17011727 ] Andy Webb commented on SOLR-13821: -- SOLR-14165 is still open - I don't know if folks want that to go into 8.4.1? As far as I can see, attempting a rolling update of a SolrCloud cluster from earlier versions to 8.4.x will currently fail, as nodes of different versions have incompatible SolrResponse classes. Andy > Package Store > - > > Key: SOLR-13821 > URL: https://issues.apache.org/jira/browse/SOLR-13821 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Fix For: 8.4 > > Time Spent: 2h > Remaining Estimate: 0h > > Package store is a storage managed by Solr that holds the package artifacts. > This is replicated across nodes. > Design is here: > [https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?ts=5d86a8ad#] > The package store is powered by an underlying filestore. This filestore is a > fully replicated p2p filesystem storage for artifacts. > The APIs are as follows > {code:java} > # add a file > POST /api/cluster/files/path/to/file.jar > #retrieve a file > GET /api/cluster/files/path/to/file.jar > #list files in the /path/to directory > GET /api/cluster/files/path/to > #GET meta info of the jar > GET /api/cluster/files/path/to/file.jar?meta=true > {code} > This store keeps 2 files per file > # The actual file, say {{myplugin.jar}} > # A metadata file {{.myplugin.jar.json}} in the same directory > The contents of the metadata file are > {code:json} > { > "sha512" : "" > "sig": { > "" :"" > }} > {code}
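The two-files-per-artifact convention in the quoted Package Store description can be sketched as follows. This is a hedged illustration only: `PackageStoreMetaSketch` and its methods are hypothetical names, and treating the `sha512` field as a hex digest of the file contents is an assumption, since the quoted description leaves that value blank.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class PackageStoreMetaSketch {
    // Maps path/to/myplugin.jar to path/to/.myplugin.jar.json, following the
    // naming convention quoted in the issue description.
    static Path metadataPathFor(Path artifact) {
        String metaName = "." + artifact.getFileName() + ".json";
        return artifact.resolveSibling(metaName);
    }

    // Hex-encoded SHA-512 of the given bytes. Assumption: the "sha512" field
    // holds a hex digest of the file contents; the source elides the value.
    static String sha512Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Path artifact = Paths.get("path/to/myplugin.jar");
        System.out.println(metadataPathFor(artifact));
        System.out.println(sha512Hex("example".getBytes(StandardCharsets.UTF_8)).length());
    }
}
```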
[jira] [Updated] (SOLR-14165) SolrResponse serialVersionUID has changed
[ https://issues.apache.org/jira/browse/SOLR-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14165: - Status: Patch Available (was: Open) > SolrResponse serialVersionUID has changed > - > > Key: SOLR-14165 > URL: https://issues.apache.org/jira/browse/SOLR-14165 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Andy Webb >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > SOLR-13821 changed the signature of > {{org.apache.solr.client.solrj.SolrResponse}}, making serialisations of the > class incompatible between versions. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14165) SolrResponse serialVersionUID has changed
Andy Webb created SOLR-14165: Summary: SolrResponse serialVersionUID has changed Key: SOLR-14165 URL: https://issues.apache.org/jira/browse/SOLR-14165 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Andy Webb SOLR-13821 changed the signature of {{org.apache.solr.client.solrj.SolrResponse}}, making serialisations of the class incompatible between versions. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13821) Package Store
[ https://issues.apache.org/jira/browse/SOLR-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007772#comment-17007772 ] Andy Webb commented on SOLR-13821: -- Sure - I've raised SOLR-14165 and https://github.com/apache/lucene-solr/pull/1140 - will fill in missing details now. Andy > Package Store > - > > Key: SOLR-13821 > URL: https://issues.apache.org/jira/browse/SOLR-13821 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Fix For: 8.4 > > Time Spent: 2h > Remaining Estimate: 0h > > Package store is a storage managed by Solr that holds the package artifacts. > This is replicated across nodes. > Design is here: > [https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?ts=5d86a8ad#] > The package store is powered by an underlying filestore. This filestore is a > fully replicated p2p filesystem storage for artifacts. > The APIs are as follows: > {code:java} > # add a file > POST /api/cluster/files/path/to/file.jar > # retrieve a file > GET /api/cluster/files/path/to/file.jar > # list files in the /path/to directory > GET /api/cluster/files/path/to > # get meta info of the jar > GET /api/cluster/files/path/to/file.jar?meta=true > {code} > This store keeps 2 files per stored artifact: > # The actual file, say {{myplugin.jar}} > # A metadata file {{.myplugin.jar.json}} in the same directory > The contents of the metadata file are: > {code:json} > { > "sha512" : "", > "sig": { > "" :"" > }} > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13821) Package Store
[ https://issues.apache.org/jira/browse/SOLR-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007763#comment-17007763 ] Andy Webb commented on SOLR-13821: -- hi, We've been experimenting with doing a rolling in-place upgrade from Solr 8.3.1 to 8.4.0 on a non-production system, but have found that we get this exception for some operations, including when requesting {{/solr/admin/collections?action=overseerstatus}} on a node whose version is inconsistent with the overseer: {{java.io.InvalidClassException: org.apache.solr.client.solrj.SolrResponse; local class incompatible: stream classdesc serialVersionUID = -7931100103360242645, local class serialVersionUID = 2239939671435624715}} As far as I can see, this is due to the change to the {{SolrResponse}} class's signature in [commit e3bd5a7|https://github.com/apache/lucene-solr/commit/e3bd5a7da271dcdbbd87cc6924982875791bd47d#diff-b809fa594f93aa6805381029a188e4e2L35]. My experimentation has shown that if the {{serialVersionUID}} of that class [is set explicitly to its previous value|https://github.com/apache/lucene-solr/compare/master...andywebb1975:SOLR-13821a] the exception no longer occurs. I'm not sure if this is a necessary or good fix, but I wanted to share this issue with you in case it's something that you think needs resolving. thanks, Andy > Package Store > - > > Key: SOLR-13821 > URL: https://issues.apache.org/jira/browse/SOLR-13821 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Fix For: 8.4 > > Time Spent: 2h > Remaining Estimate: 0h > > Package store is a storage managed by Solr that holds the package artifacts. > This is replicated across nodes. > Design is here: > [https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?ts=5d86a8ad#] > The package store is powered by an underlying filestore. 
This filestore is a > fully replicated p2p filesystem storage for artifacts. > The APIs are as follows: > {code:java} > # add a file > POST /api/cluster/files/path/to/file.jar > # retrieve a file > GET /api/cluster/files/path/to/file.jar > # list files in the /path/to directory > GET /api/cluster/files/path/to > # get meta info of the jar > GET /api/cluster/files/path/to/file.jar?meta=true > {code} > This store keeps 2 files per stored artifact: > # The actual file, say {{myplugin.jar}} > # A metadata file {{.myplugin.jar.json}} in the same directory > The contents of the metadata file are: > {code:json} > { > "sha512" : "", > "sig": { > "" :"" > }} > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004296#comment-17004296 ] Andy Webb commented on SOLR-14131: -- That's great, thanks for your help with this Bruno! Andy > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Assignee: Bruno Roustant >Priority: Minor > Fix For: 8.5 > > Time Spent: 3.5h > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. > Here's a PR: https://github.com/apache/lucene-solr/pull/1113 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002368#comment-17002368 ] Andy Webb commented on SOLR-14131: -- hi Bruno - thanks for committing the Lucene change! I've linked [PR 1113|https://github.com/apache/lucene-solr/pull/1113] as the earlier one got messy - but with a colleague's help I think I've got a decent test for the change - let me know what you think. Andy > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. > Here's a PR: https://github.com/apache/lucene-solr/pull/1113 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14131: - Status: Patch Available (was: Open) > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. > Here's a PR: https://github.com/apache/lucene-solr/pull/1113 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14131: - Description: Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. Here's a PR: https://github.com/apache/lucene-solr/pull/1113 was: Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. Here's a draft PR: https://github.com/apache/lucene-solr/pull/1112 (I'm struggling writing tests, and we should update the Solr docs too.) > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. 
> Here's a PR: https://github.com/apache/lucene-solr/pull/1113
[jira] [Updated] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14131: - Description: Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. Here's a draft PR: https://github.com/apache/lucene-solr/pull/1112 (I'm struggling writing tests, and we should update the Solr docs too.) was:Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellchecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. > Here's a draft PR: https://github.com/apache/lucene-solr/pull/1112 (I'm > struggling writing tests, and we should update the Solr docs too.) 
[jira] [Updated] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated LUCENE-9102: -- Status: Patch Available (was: Open) > Add maxQueryLength option to DirectSpellchecker > --- > > Key: LUCENE-9102 > URL: https://issues.apache.org/jira/browse/LUCENE-9102 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spellchecker >Reporter: Andy Webb >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This > change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option > to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to > spellcheck terms over a specified length. > PR: https://github.com/apache/lucene-solr/pull/1103 > Dependent Solr issue: SOLR-14131 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated LUCENE-9102: -- Description: Attempting to spellcheck some long query terms can trigger {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to spellcheck terms over a specified length. PR: https://github.com/apache/lucene-solr/pull/1103 was: Attempting to spellcheck some long query terms can trigger {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to spellcheck terms over a specified length. Draft PR: https://github.com/apache/lucene-solr/pull/1103 > Add maxQueryLength option to DirectSpellchecker > --- > > Key: LUCENE-9102 > URL: https://issues.apache.org/jira/browse/LUCENE-9102 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spellchecker >Reporter: Andy Webb >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This > change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option > to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to > spellcheck terms over a specified length. > PR: https://github.com/apache/lucene-solr/pull/1103 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated LUCENE-9102: -- Description: Attempting to spellcheck some long query terms can trigger {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to spellcheck terms over a specified length. PR: https://github.com/apache/lucene-solr/pull/1103 Dependent Solr issue: SOLR-14131 was: Attempting to spellcheck some long query terms can trigger {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to spellcheck terms over a specified length. PR: https://github.com/apache/lucene-solr/pull/1103 > Add maxQueryLength option to DirectSpellchecker > --- > > Key: LUCENE-9102 > URL: https://issues.apache.org/jira/browse/LUCENE-9102 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spellchecker >Reporter: Andy Webb >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This > change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option > to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to > spellcheck terms over a specified length. > PR: https://github.com/apache/lucene-solr/pull/1103 > Dependent Solr issue: SOLR-14131 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14131: - Description: Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellchecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellchecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
Andy Webb created SOLR-14131: Summary: Add maxQueryLength option to DirectSolrSpellchecker Key: SOLR-14131 URL: https://issues.apache.org/jira/browse/SOLR-14131 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: search Reporter: Andy Webb -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated LUCENE-9102: -- Description: Attempting to spellcheck some long query terms can trigger {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to spellcheck terms over a specified length. Draft PR: https://github.com/apache/lucene-solr/pull/1103 was: Attempting to spellcheck some long query terms can trigger {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to spellcheck terms over a specified length. (PR incoming) > Add maxQueryLength option to DirectSpellchecker > --- > > Key: LUCENE-9102 > URL: https://issues.apache.org/jira/browse/LUCENE-9102 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spellchecker >Reporter: Andy Webb >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This > change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option > to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to > spellcheck terms over a specified length. > Draft PR: https://github.com/apache/lucene-solr/pull/1103 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
Andy Webb created LUCENE-9102: - Summary: Add maxQueryLength option to DirectSpellchecker Key: LUCENE-9102 URL: https://issues.apache.org/jira/browse/LUCENE-9102 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Reporter: Andy Webb Attempting to spellcheck some long query terms can trigger {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to spellcheck terms over a specified length. (PR incoming) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
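The length check proposed in this issue can be sketched as a small guard. This is an illustration and not the Lucene implementation: QueryLengthGuard is a hypothetical class, counting code points is an assumption, and treating 0 as "no limit" follows the default suggested in the SOLR-13190 discussion.

```java
// Hypothetical guard, not Lucene's DirectSpellChecker: decides whether a term
// is within the configured length bounds before any spellchecking is attempted.
class QueryLengthGuard {
  private final int minQueryLength;
  private final int maxQueryLength; // 0 means "no limit" (assumed default)

  QueryLengthGuard(int minQueryLength, int maxQueryLength) {
    if (minQueryLength < 0 || maxQueryLength < 0) {
      throw new IllegalArgumentException("lengths must not be negative");
    }
    if (maxQueryLength != 0 && maxQueryLength < minQueryLength) {
      throw new IllegalArgumentException("maxQueryLength must be 0 or >= minQueryLength");
    }
    this.minQueryLength = minQueryLength;
    this.maxQueryLength = maxQueryLength;
  }

  // True when the term is long enough and, if a maximum is set, short enough.
  boolean shouldSpellcheck(String term) {
    int length = term.codePointCount(0, term.length());
    if (length < minQueryLength) {
      return false;
    }
    return maxQueryLength == 0 || length <= maxQueryLength;
  }

  public static void main(String[] args) {
    QueryLengthGuard guard = new QueryLengthGuard(4, 8);
    System.out.println(guard.shouldSpellcheck("abc"));
    System.out.println(guard.shouldSpellcheck("abcdef"));
    System.out.println(guard.shouldSpellcheck("abcdefghi"));
  }
}
```

Skipping over-long terms up front means the automaton for them is never built, so the TooComplexToDeterminizeException cannot be raised from the spellcheck path for those terms.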
[jira] [Commented] (SOLR-13190) Fuzzy search treated as server error instead of client error when terms are too complex
[ https://issues.apache.org/jira/browse/SOLR-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999393#comment-16999393 ] Andy Webb commented on SOLR-13190: -- Thanks Mike, I'll get on the case! > Fuzzy search treated as server error instead of client error when terms are > too complex > --- > > Key: SOLR-13190 > URL: https://issues.apache.org/jira/browse/SOLR-13190 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: master (9.0) >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > We've seen a fuzzy search end up breaking the automaton and getting reported > as a server error. This usage should be improved by > 1) reporting as a client error, because it's similar to something like too > many boolean clauses queries in how an operator should deal with it > 2) report what field is causing the error, since that currently must be > deduced from adjacent query logs and can be difficult if there are multiple > terms in the search > This trigger was added to defend against adversarial regex but somehow hits > fuzzy terms as well, I don't understand enough about the automaton mechanisms > to really know how to approach a fix there, but improving the operability is > a good first step. > relevant stack trace: > {noformat} > org.apache.lucene.util.automaton.TooComplexToDeterminizeException: > Determinizing automaton with 13632 states and 21348 transitions would result > in more than 1 states. 
> at > org.apache.lucene.util.automaton.Operations.determinize(Operations.java:746) > at > org.apache.lucene.util.automaton.RunAutomaton.<init>(RunAutomaton.java:69) > at > org.apache.lucene.util.automaton.ByteRunAutomaton.<init>(ByteRunAutomaton.java:32) > at > org.apache.lucene.util.automaton.CompiledAutomaton.<init>(CompiledAutomaton.java:247) > at > org.apache.lucene.util.automaton.CompiledAutomaton.<init>(CompiledAutomaton.java:133) > at > org.apache.lucene.search.FuzzyTermsEnum.<init>(FuzzyTermsEnum.java:143) > at org.apache.lucene.search.FuzzyQuery.getTermsEnum(FuzzyQuery.java:154) > at > org.apache.lucene.search.MultiTermQuery$RewriteMethod.getTermsEnum(MultiTermQuery.java:78) > at > org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:58) > at > org.apache.lucene.search.TopTermsRewrite.rewrite(TopTermsRewrite.java:67) > at > org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:310) > at > org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:667) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:442) > at > org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:200) > at > org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1604) > at > org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1420) > at > org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:567) > at > org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1435) > at > org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:374) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2559) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: 
issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
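The two improvements requested in SOLR-13190 (report a client error, and name the offending field) can be sketched as an error-mapping step. Everything here is hypothetical and is not Solr's actual error handling: TermTooComplexException stands in for the Lucene exception and is assumed to carry the field name, and QueryErrorMapper is an invented helper.

```java
// Hypothetical exception standing in for an "automaton too complex" failure;
// it carries the field name so the error report can include it.
class TermTooComplexException extends RuntimeException {
  private final String field;

  TermTooComplexException(String field, String message) {
    super(message);
    this.field = field;
  }

  String getField() {
    return field;
  }
}

// Hypothetical mapper: decides the HTTP status and message for a failed query.
class QueryErrorMapper {
  static final int BAD_REQUEST = 400;
  static final int SERVER_ERROR = 500;

  // Walks the cause chain looking for the "too complex" failure.
  static TermTooComplexException findTooComplex(Throwable error) {
    for (Throwable t = error; t != null; t = t.getCause()) {
      if (t instanceof TermTooComplexException) {
        return (TermTooComplexException) t;
      }
    }
    return null;
  }

  // A too-complex term is the client's problem (400); anything else stays a 500.
  static int statusFor(Throwable error) {
    return findTooComplex(error) != null ? BAD_REQUEST : SERVER_ERROR;
  }

  // Includes the field name so it need not be deduced from adjacent query logs.
  static String messageFor(Throwable error) {
    TermTooComplexException cause = findTooComplex(error);
    if (cause == null) {
      return "Internal error";
    }
    return "Term too complex in field '" + cause.getField() + "': " + cause.getMessage();
  }

  public static void main(String[] args) {
    Throwable wrapped =
        new RuntimeException("search failed", new TermTooComplexException("title", "too many states"));
    System.out.println(statusFor(wrapped));
    System.out.println(messageFor(wrapped));
  }
}
```

Walking the cause chain matters because the failure surfaces several frames below the request handler, as the stack trace shows, and may be wrapped by the time it is reported.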
[jira] [Comment Edited] (SOLR-13190) Fuzzy search treated as server error instead of client error when terms are too complex
[ https://issues.apache.org/jira/browse/SOLR-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999118#comment-16999118 ] Andy Webb edited comment on SOLR-13190 at 12/18/19 1:18 PM: Thanks Mike! I've [tried adding|https://github.com/apache/lucene-solr/pull/1098] a {{maxQueryLength}} option to {{Direct(Solr)SpellChecker}} which can be set to prevent long terms being spellchecked - it's a simple change, largely a cut-and-paste of the {{minQueryLength}}, and as far as I can see this would prevent us seeing the exceptions. It could default to 0, i.e. "no limit", to maintain the existing default behaviour unless it's deliberately set. Would this be a reasonable change to make to Lucene/Solr or do you think there might be a better approach? was (Author: andywebb1975): Thanks Mike! I've [tried adding|https://github.com/apache/lucene-solr/compare/master...andywebb1975:maxQueryLength] a {{maxQueryLength}} option to {{Direct(Solr)SpellChecker}} which can be set to prevent long terms being spellchecked - it's a simple change, largely a cut-and-paste of the {{minQueryLength}}, and as far as I can see this would prevent us seeing the exceptions. It could default to 0, i.e. "no limit", to maintain the existing default behaviour unless it's deliberately set. Would this be a reasonable change to make to Lucene/Solr or do you think there might be a better approach? > Fuzzy search treated as server error instead of client error when terms are > too complex > --- > > Key: SOLR-13190 > URL: https://issues.apache.org/jira/browse/SOLR-13190 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: master (9.0) >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > We've seen a fuzzy search end up breaking the automaton and getting reported > as a server error. 
This usage should be improved by > 1) reporting as a client error, because it's similar to something like too > many boolean clauses queries in how an operator should deal with it > 2) report what field is causing the error, since that currently must be > deduced from adjacent query logs and can be difficult if there are multiple > terms in the search > This trigger was added to defend against adversarial regex but somehow hits > fuzzy terms as well, I don't understand enough about the automaton mechanisms > to really know how to approach a fix there, but improving the operability is > a good first step. > relevant stack trace: > {noformat} > org.apache.lucene.util.automaton.TooComplexToDeterminizeException: > Determinizing automaton with 13632 states and 21348 transitions would result > in more than 1 states. > at > org.apache.lucene.util.automaton.Operations.determinize(Operations.java:746) > at > org.apache.lucene.util.automaton.RunAutomaton.<init>(RunAutomaton.java:69) > at > org.apache.lucene.util.automaton.ByteRunAutomaton.<init>(ByteRunAutomaton.java:32) > at > org.apache.lucene.util.automaton.CompiledAutomaton.<init>(CompiledAutomaton.java:247) > at > org.apache.lucene.util.automaton.CompiledAutomaton.<init>(CompiledAutomaton.java:133) > at > org.apache.lucene.search.FuzzyTermsEnum.<init>(FuzzyTermsEnum.java:143) > at org.apache.lucene.search.FuzzyQuery.getTermsEnum(FuzzyQuery.java:154) > at > org.apache.lucene.search.MultiTermQuery$RewriteMethod.getTermsEnum(MultiTermQuery.java:78) > at > org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:58) > at > org.apache.lucene.search.TopTermsRewrite.rewrite(TopTermsRewrite.java:67) > at > org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:310) > at > org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:667) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:442) > at > org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:200) 
> at > org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1604) > at > org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1420) > at > org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:567) > at > org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1435) > at > org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:374) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199) > at org.apache.s
[jira] [Comment Edited] (SOLR-13190) Fuzzy search treated as server error instead of client error when terms are too complex
[ https://issues.apache.org/jira/browse/SOLR-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999118#comment-16999118 ]

Andy Webb edited comment on SOLR-13190 at 12/18/19 1:11 PM:

Thanks Mike! I've [tried adding|https://github.com/apache/lucene-solr/compare/master...andywebb1975:maxQueryLength] a {{maxQueryLength}} option to {{Direct(Solr)SpellChecker}} which can be set to prevent long terms being spellchecked - it's a simple change, largely a cut-and-paste of the {{minQueryLength}}, and as far as I can see this would prevent us seeing the exceptions. It could default to 0, i.e. "no limit", to maintain the existing default behaviour unless it's deliberately set.

Would this be a reasonable change to make to Lucene/Solr or do you think there might be a better approach?

was (Author: andywebb1975):
Thanks Mike! I've tried adding a {{maxQueryLength}} option to {{Direct(Solr)SpellChecker}} which can be set to prevent long terms being spellchecked - it's a simple change, largely a cut-and-paste of the {{minQueryLength}}, and as far as I can see this would prevent us seeing the exceptions. It could default to 0, i.e. "no limit", to maintain the existing default behaviour unless it's deliberately set.

Would this be a reasonable change to make to Lucene/Solr or do you think there might be a better approach?

> Fuzzy search treated as server error instead of client error when terms are too complex
> ---
>
> Key: SOLR-13190
> URL: https://issues.apache.org/jira/browse/SOLR-13190
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: master (9.0)
> Reporter: Mike Drob
> Assignee: Mike Drob
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> We've seen a fuzzy search end up breaking the automaton and getting reported
> as a server error.
[jira] [Commented] (SOLR-13190) Fuzzy search treated as server error instead of client error when terms are too complex
[ https://issues.apache.org/jira/browse/SOLR-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999118#comment-16999118 ] Andy Webb commented on SOLR-13190: -- Thanks Mike! I've tried adding a {{maxQueryLength}} option to {{Direct(Solr)SpellChecker}} which can be set to prevent long terms being spellchecked - it's a simple change, largely a cut-and-paste of the {{minQueryLength}}, and as far as I can see this would prevent us seeing the exceptions. It could default to 0, i.e. "no limit", to maintain the existing default behaviour unless it's deliberately set. Would this be a reasonable change to make to Lucene/Solr or do you think there might be a better approach? > Fuzzy search treated as server error instead of client error when terms are > too complex > --- > > Key: SOLR-13190 > URL: https://issues.apache.org/jira/browse/SOLR-13190 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: master (9.0) >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > We've seen a fuzzy search end up breaking the automaton and getting reported > as a server error. This usage should be improved by > 1) reporting as a client error, because it's similar to something like too > many boolean clauses queries in how an operator should deal with it > 2) report what field is causing the error, since that currently must be > deduced from adjacent query logs and can be difficult if there are multiple > terms in the search > This trigger was added to defend against adversarial regex but somehow hits > fuzzy terms as well, I don't understand enough about the automaton mechanisms > to really know how to approach a fix there, but improving the operability is > a good first step. 
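The {{maxQueryLength}} idea described in the comment above can be illustrated with a small sketch. This is not the code from the linked branch: the class name QueryLengthGuard and the method shouldSpellcheck are hypothetical, and the only behaviour taken from the comment is that a limit of 0 means "no limit" and that the upper bound mirrors the existing {{minQueryLength}} check.

```java
public final class QueryLengthGuard {
    private QueryLengthGuard() {}

    /**
     * Hypothetical helper: decides whether a term is worth spellchecking.
     * For either bound, a value of 0 (or less) is treated as "no limit".
     */
    public static boolean shouldSpellcheck(String term, int minQueryLength, int maxQueryLength) {
        // Count code points rather than UTF-16 chars so supplementary characters count once.
        int length = term.codePointCount(0, term.length());
        if (minQueryLength > 0 && length < minQueryLength) {
            return false; // too short to produce a meaningful suggestion
        }
        if (maxQueryLength > 0 && length > maxQueryLength) {
            return false; // too long: skip rather than risk a very large automaton
        }
        return true;
    }
}
```

With maxQueryLength left at 0 the helper behaves as if only the minimum applied, which matches the stated goal of keeping the existing default behaviour unless the option is deliberately set.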
[jira] [Commented] (SOLR-13190) Fuzzy search treated as server error instead of client error when terms are too complex
[ https://issues.apache.org/jira/browse/SOLR-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998537#comment-16998537 ] Andy Webb commented on SOLR-13190: -- (Obviously we don't expect to receive a useful correction for such queries!) > Fuzzy search treated as server error instead of client error when terms are > too complex > --- > > Key: SOLR-13190 > URL: https://issues.apache.org/jira/browse/SOLR-13190 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: master (9.0) >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > We've seen a fuzzy search end up breaking the automaton and getting reported > as a server error. This usage should be improved by > 1) reporting as a client error, because it's similar to something like too > many boolean clauses queries in how an operator should deal with it > 2) report what field is causing the error, since that currently must be > deduced from adjacent query logs and can be difficult if there are multiple > terms in the search > This trigger was added to defend against adversarial regex but somehow hits > fuzzy terms as well, I don't understand enough about the automaton mechanisms > to really know how to approach a fix there, but improving the operability is > a good first step. > relevant stack trace: > {noformat} > org.apache.lucene.util.automaton.TooComplexToDeterminizeException: > Determinizing automaton with 13632 states and 21348 transitions would result > in more than 1 states. 
[jira] [Commented] (SOLR-13190) Fuzzy search treated as server error instead of client error when terms are too complex
[ https://issues.apache.org/jira/browse/SOLR-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998533#comment-16998533 ]

Andy Webb commented on SOLR-13190:
--
Hi, we're seeing {{TooComplexToDeterminizeException}} in production - our current mitigation is to avoid asking Solr to spellcheck long queries.

The exception can be triggered in 8.3.0 as follows:
# create a collection (e.g. {{default}}) using the {{_default}} config
# add a single document with some random content in the {{\_text_}} field
# send a spellcheck request such as {{/solr/default/spell?q=kjshgkjahdskjgadhsgkahsdkgskd%C4%A3shdjghaksdhdhdkadhgkjahsdkjgahskdghjjhgkasjdhgajhdskgjahsdgkahjsdkjghaksd}}

The presence of a multi-byte character seems to matter - without it the query can be several times longer before a StackOverflowError is thrown instead.

Would you expect the PR on this ticket to resolve this? If so, we'd be very keen to see it merged in please. (I'll try spinning up a custom build of Solr with the patch applied to test this myself.)

thanks,
Andy

> Fuzzy search treated as server error instead of client error when terms are too complex
> ---
>
> Key: SOLR-13190
> URL: https://issues.apache.org/jira/browse/SOLR-13190
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: master (9.0)
> Reporter: Mike Drob
> Assignee: Mike Drob
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We've seen a fuzzy search end up breaking the automaton and getting reported
> as a server error.
This usage should be improved by
> 1) reporting as a client error, because it's similar to something like too
> many boolean clauses queries in how an operator should deal with it
> 2) report what field is causing the error, since that currently must be
> deduced from adjacent query logs and can be difficult if there are multiple
> terms in the search
> This trigger was added to defend against adversarial regex but somehow hits
> fuzzy terms as well, I don't understand enough about the automaton mechanisms
> to really know how to approach a fix there, but improving the operability is
> a good first step.
> relevant stack trace:
> {noformat}
> org.apache.lucene.util.automaton.TooComplexToDeterminizeException: Determinizing automaton with 13632 states and 21348 transitions would result in more than 1 states.
>   at org.apache.lucene.util.automaton.Operations.determinize(Operations.java:746)
>   at org.apache.lucene.util.automaton.RunAutomaton.<init>(RunAutomaton.java:69)
>   at org.apache.lucene.util.automaton.ByteRunAutomaton.<init>(ByteRunAutomaton.java:32)
>   at org.apache.lucene.util.automaton.CompiledAutomaton.<init>(CompiledAutomaton.java:247)
>   at org.apache.lucene.util.automaton.CompiledAutomaton.<init>(CompiledAutomaton.java:133)
>   at org.apache.lucene.search.FuzzyTermsEnum.<init>(FuzzyTermsEnum.java:143)
>   at org.apache.lucene.search.FuzzyQuery.getTermsEnum(FuzzyQuery.java:154)
>   at org.apache.lucene.search.MultiTermQuery$RewriteMethod.getTermsEnum(MultiTermQuery.java:78)
>   at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:58)
>   at org.apache.lucene.search.TopTermsRewrite.rewrite(TopTermsRewrite.java:67)
>   at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:310)
>   at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:667)
>   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:442)
>   at org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:200)
>   at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1604)
>   at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1420)
>   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:567)
>   at org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1435)
>   at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:374)
>   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2559)
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
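The reproduction request in the comment above contains {{%C4%A3}}, the URL-encoded form of a two-byte UTF-8 character. The sketch below shows how such a request URL could be assembled and how to confirm the byte length of that character. The class SpellRequestUrl, its method names and the base URL {{http://localhost:8983}} are assumptions for illustration only; the {{/solr/<collection>/spell}} path is the one quoted in the comment. It uses {{URLEncoder.encode(String, Charset)}}, which needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public final class SpellRequestUrl {
    private SpellRequestUrl() {}

    /** Builds a spellcheck request URL with the query percent-encoded as UTF-8. */
    public static String build(String baseUrl, String collection, String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/solr/" + collection + "/spell?q=" + encoded;
    }

    /** Returns the number of bytes the string occupies when encoded as UTF-8. */
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+0123 is the character behind %C4%A3 in the reproduction query.
        String query = "kjshg\u0123shdjg";
        System.out.println(build("http://localhost:8983", "default", query));
        System.out.println(utf8Length("\u0123"));
    }
}
```

Sending the resulting URL to a running Solr instance is left out here, since whether the exception is triggered depends on the collection and index contents described in the steps above.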