Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-08 Thread via GitHub


gerlowskija merged PR #2395:
URL: https://github.com/apache/solr/pull/2395


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-08 Thread via GitHub


dsmiley commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2100767344

   Cool; please remember to mention me in CHANGES.txt.  And thanks for adding 
this protection.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-08 Thread via GitHub


gerlowskija commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2100578628

   Thanks for the change David - apologies for the back and forth trying to 
understand what you were suggesting.  I've merged your PR into this branch.
   
   With that wrapped up, I'm going to re-run tests/check/precommit and aim to 
merge this afternoon.  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-08 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1594022421


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * significantly from the number of fields defined in managed-schema.xml - 
especially when dynamic
+ * fields are enabled. The only way to reduce this field count is to delete 
documents and wait until
+ * the deleted documents have been removed by segment merges. Users may of 
course speed up this
+ * process by tweaking Solr's segment-merging, triggering an "optimize" 
operation, etc.
+ *
+ * {@link NumFieldLimitingUpdateRequestProcessorFactory} accepts two 
configuration parameters:
+ *
+ * 
+ *   maxFields - (required) The maximum number of fields 
before update requests
+ *   should be aborted. Once this limit has been exceeded, additional 
update requests will fail
+ *   until fields have been removed or the "maxFields" is increased.
+ *   warnOnly - (optional) If true then the URP 
logs verbose warnings
+ *   about the limit being exceeded but doesn't abort update requests. 
Defaults to false
+ *if not specified
+ * 
+ *
+ * @since 9.6.0
+ */
+public class NumFieldLimitingUpdateRequestProcessorFactory extends 
UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final String MAXIMUM_FIELDS_PARAM = "maxFields";
+  private static final String WARN_ONLY_PARAM = "warnOnly";
+
+  private NumFieldsMonitor numFieldsMonitor;
+  private int maximumFields;
+  private boolean warnOnly;
+
+  @Override
+  public void inform(final SolrCore core) {
+// register a commit callback for monitoring the number of fields in the 
schema
+numFieldsMonitor = new NumFieldsMonitor(core);
+core.getUpdateHandler().registerCommitCallback(numFieldsMonitor);
+core.registerNewSearcherListener(numFieldsMonitor);
+  }
+
+  @Override
+  public void init(NamedList args) {
+warnOnly = args.indexOf(WARN_ONLY_PARAM, 0) > 0 ? 
args.getBooleanArg(WARN_ONLY_PARAM) : false;
+
+if (args.indexOf(MAXIMUM_FIELDS_PARAM, 0) < 0) {
+  throw new IllegalArgumentException(
+  "The "
+  + MAXIMUM_FIELDS_PARAM
+  + " parameter is required for "
+  + getClass().getName()
+  + ", but no value was provided.");
+}
+final Object rawMaxFields = args.get(MAXIMUM_FIELDS_PARAM);
+if (rawMaxFields == null || !(rawMaxFields instanceof Integer)) {
+  throw new IllegalArgumentException(
+  MAXIMUM_FIELDS_PARAM + " must be configured as a non-null ");
+}
+maximumFields = (Integer) rawMaxFields;
+if (maximumFields <= 0) {
+  throw new IllegalArgumentException(MAXIMUM_FIELDS_PARAM + " must be a 
positive integer");
+}
+  }
+
+  @Override
+  public UpdateRequestProcessor getInstance(
+  SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor 
next) {
+

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-08 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1594022421


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * significantly from the number of fields defined in managed-schema.xml - 
especially when dynamic
+ * fields are enabled. The only way to reduce this field count is to delete 
documents and wait until
+ * the deleted documents have been removed by segment merges. Users may of 
course speed up this
+ * process by tweaking Solr's segment-merging, triggering an "optimize" 
operation, etc.
+ *
+ * {@link NumFieldLimitingUpdateRequestProcessorFactory} accepts two 
configuration parameters:
+ *
+ * 
+ *   maxFields - (required) The maximum number of fields 
before update requests
+ *   should be aborted. Once this limit has been exceeded, additional 
update requests will fail
+ *   until fields have been removed or the "maxFields" is increased.
+ *   warnOnly - (optional) If true then the URP 
logs verbose warnings
+ *   about the limit being exceeded but doesn't abort update requests. 
Defaults to false
+ *if not specified
+ * 
+ *
+ * @since 9.6.0
+ */
+public class NumFieldLimitingUpdateRequestProcessorFactory extends 
UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final String MAXIMUM_FIELDS_PARAM = "maxFields";
+  private static final String WARN_ONLY_PARAM = "warnOnly";
+
+  private NumFieldsMonitor numFieldsMonitor;
+  private int maximumFields;
+  private boolean warnOnly;
+
+  @Override
+  public void inform(final SolrCore core) {
+// register a commit callback for monitoring the number of fields in the 
schema
+numFieldsMonitor = new NumFieldsMonitor(core);
+core.getUpdateHandler().registerCommitCallback(numFieldsMonitor);
+core.registerNewSearcherListener(numFieldsMonitor);
+  }
+
+  @Override
+  public void init(NamedList args) {
+warnOnly = args.indexOf(WARN_ONLY_PARAM, 0) > 0 ? 
args.getBooleanArg(WARN_ONLY_PARAM) : false;
+
+if (args.indexOf(MAXIMUM_FIELDS_PARAM, 0) < 0) {
+  throw new IllegalArgumentException(
+  "The "
+  + MAXIMUM_FIELDS_PARAM
+  + " parameter is required for "
+  + getClass().getName()
+  + ", but no value was provided.");
+}
+final Object rawMaxFields = args.get(MAXIMUM_FIELDS_PARAM);
+if (rawMaxFields == null || !(rawMaxFields instanceof Integer)) {
+  throw new IllegalArgumentException(
+  MAXIMUM_FIELDS_PARAM + " must be configured as a non-null ");
+}
+maximumFields = (Integer) rawMaxFields;
+if (maximumFields <= 0) {
+  throw new IllegalArgumentException(MAXIMUM_FIELDS_PARAM + " must be a 
positive integer");
+}
+  }
+
+  @Override
+  public UpdateRequestProcessor getInstance(
+  SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor 
next) {
+

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-08 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1594016753


##
solr/core/src/test-files/solr/configsets/cloud-minimal-field-limiting/conf/solrconfig.xml:
##
@@ -0,0 +1,73 @@
+
+
+
+
+
+
+
+
+  ${solr.data.dir:}
+
+  
+  
+
+  ${tests.luceneMatchVersion:LATEST}
+
+  
+
+  ${solr.commitwithin.softcommit:true}
+
+
+  
+
+  
+
+  explicit
+  true
+  text
+
+  
+
+  
+${solr.test.maxFields:1234}
+  

Review Comment:
   I've updated this to be in line with your request.
   
   Though I'm still curious if there's a reason to "prefer" this going forward, 
or whether it's subjective.  I've not paid much attention historically to how 
these appear in our configsets, but naively I'd assume that more composability 
is better.



##
solr/core/src/test-files/solr/configsets/cloud-minimal-field-limiting/conf/solrconfig.xml:
##
@@ -0,0 +1,73 @@
+
+
+
+
+
+
+
+
+  ${solr.data.dir:}
+
+  
+  
+
+  ${tests.luceneMatchVersion:LATEST}
+
+  
+
+  ${solr.commitwithin.softcommit:true}
+
+
+  
+
+  
+
+  explicit
+  true
+  text
+
+  
+
+  
+${solr.test.maxFields:1234}
+  

Review Comment:
   I've updated this to be in line with your request.
   
   Though I'm still curious if there's a reason to prefer this going forward, 
or whether it's subjective.  I've not paid much attention historically to how 
these appear in our configsets, but naively I'd assume that more composability 
is better.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-06 Thread via GitHub


dsmiley commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2097012109

   See https://github.com/gerlowskija/solr/pull/5 for a commit that makes a 
large simplification:
   * No NumFieldsMonitor
   * No named URP; anonymous class is fine (and less code)
   * URP need not be inserted in the happy path!
   
   The single line I refer to earlier is as follows in my PR to your PR:
   `final int currentNumFields = req.getSearcher().getFieldInfos().size();`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-06 Thread via GitHub


dsmiley commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1591513535


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * significantly from the number of fields defined in managed-schema.xml - 
especially when dynamic
+ * fields are enabled. The only way to reduce this field count is to delete 
documents and wait until
+ * the deleted documents have been removed by segment merges. Users may of 
course speed up this
+ * process by tweaking Solr's segment-merging, triggering an "optimize" 
operation, etc.
+ *
+ * {@link NumFieldLimitingUpdateRequestProcessorFactory} accepts two 
configuration parameters:
+ *
+ * 
+ *   maxFields - (required) The maximum number of fields 
before update requests
+ *   should be aborted. Once this limit has been exceeded, additional 
update requests will fail
+ *   until fields have been removed or the "maxFields" is increased.
+ *   warnOnly - (optional) If true then the URP 
logs verbose warnings
+ *   about the limit being exceeded but doesn't abort update requests. 
Defaults to false
+ *if not specified
+ * 
+ *
+ * @since 9.6.0
+ */
+public class NumFieldLimitingUpdateRequestProcessorFactory extends 
UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final String MAXIMUM_FIELDS_PARAM = "maxFields";
+  private static final String WARN_ONLY_PARAM = "warnOnly";
+
+  private NumFieldsMonitor numFieldsMonitor;
+  private int maximumFields;
+  private boolean warnOnly;
+
+  @Override
+  public void inform(final SolrCore core) {
+// register a commit callback for monitoring the number of fields in the 
schema
+numFieldsMonitor = new NumFieldsMonitor(core);
+core.getUpdateHandler().registerCommitCallback(numFieldsMonitor);
+core.registerNewSearcherListener(numFieldsMonitor);
+  }
+
+  @Override
+  public void init(NamedList args) {
+warnOnly = args.indexOf(WARN_ONLY_PARAM, 0) > 0 ? 
args.getBooleanArg(WARN_ONLY_PARAM) : false;
+
+if (args.indexOf(MAXIMUM_FIELDS_PARAM, 0) < 0) {
+  throw new IllegalArgumentException(
+  "The "
+  + MAXIMUM_FIELDS_PARAM
+  + " parameter is required for "
+  + getClass().getName()
+  + ", but no value was provided.");
+}
+final Object rawMaxFields = args.get(MAXIMUM_FIELDS_PARAM);
+if (rawMaxFields == null || !(rawMaxFields instanceof Integer)) {
+  throw new IllegalArgumentException(
+  MAXIMUM_FIELDS_PARAM + " must be configured as a non-null ");
+}
+maximumFields = (Integer) rawMaxFields;
+if (maximumFields <= 0) {
+  throw new IllegalArgumentException(MAXIMUM_FIELDS_PARAM + " must be a 
positive integer");
+}
+  }
+
+  @Override
+  public UpdateRequestProcessor getInstance(
+  SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor 
next) {
+
+

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-06 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1591039849


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * significantly from the number of fields defined in managed-schema.xml - 
especially when dynamic
+ * fields are enabled. The only way to reduce this field count is to delete 
documents and wait until
+ * the deleted documents have been removed by segment merges. Users may of 
course speed up this
+ * process by tweaking Solr's segment-merging, triggering an "optimize" 
operation, etc.
+ *
+ * {@link NumFieldLimitingUpdateRequestProcessorFactory} accepts two 
configuration parameters:
+ *
+ * 
+ *   maxFields - (required) The maximum number of fields 
before update requests
+ *   should be aborted. Once this limit has been exceeded, additional 
update requests will fail
+ *   until fields have been removed or the "maxFields" is increased.
+ *   warnOnly - (optional) If true then the URP 
logs verbose warnings
+ *   about the limit being exceeded but doesn't abort update requests. 
Defaults to false
+ *if not specified
+ * 
+ *
+ * @since 9.6.0
+ */
+public class NumFieldLimitingUpdateRequestProcessorFactory extends 
UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final String MAXIMUM_FIELDS_PARAM = "maxFields";
+  private static final String WARN_ONLY_PARAM = "warnOnly";
+
+  private NumFieldsMonitor numFieldsMonitor;
+  private int maximumFields;
+  private boolean warnOnly;
+
+  @Override
+  public void inform(final SolrCore core) {
+// register a commit callback for monitoring the number of fields in the 
schema
+numFieldsMonitor = new NumFieldsMonitor(core);
+core.getUpdateHandler().registerCommitCallback(numFieldsMonitor);
+core.registerNewSearcherListener(numFieldsMonitor);
+  }
+
+  @Override
+  public void init(NamedList args) {
+warnOnly = args.indexOf(WARN_ONLY_PARAM, 0) > 0 ? 
args.getBooleanArg(WARN_ONLY_PARAM) : false;
+
+if (args.indexOf(MAXIMUM_FIELDS_PARAM, 0) < 0) {
+  throw new IllegalArgumentException(
+  "The "
+  + MAXIMUM_FIELDS_PARAM
+  + " parameter is required for "
+  + getClass().getName()
+  + ", but no value was provided.");
+}
+final Object rawMaxFields = args.get(MAXIMUM_FIELDS_PARAM);
+if (rawMaxFields == null || !(rawMaxFields instanceof Integer)) {
+  throw new IllegalArgumentException(
+  MAXIMUM_FIELDS_PARAM + " must be configured as a non-null ");
+}
+maximumFields = (Integer) rawMaxFields;
+if (maximumFields <= 0) {
+  throw new IllegalArgumentException(MAXIMUM_FIELDS_PARAM + " must be a 
positive integer");
+}
+  }
+
+  @Override
+  public UpdateRequestProcessor getInstance(
+  SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor 
next) {
+

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-06 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1591064705


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessor.java:
##
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.Locale;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class NumFieldLimitingUpdateRequestProcessor extends 
UpdateRequestProcessor {

Review Comment:
   Missed this earlier; done!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-06 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1591039849


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * significantly from the number of fields defined in managed-schema.xml - 
especially when dynamic
+ * fields are enabled. The only way to reduce this field count is to delete 
documents and wait until
+ * the deleted documents have been removed by segment merges. Users may of 
course speed up this
+ * process by tweaking Solr's segment-merging, triggering an "optimize" 
operation, etc.
+ *
+ * {@link NumFieldLimitingUpdateRequestProcessorFactory} accepts two 
configuration parameters:
+ *
+ * 
+ *   maxFields - (required) The maximum number of fields 
before update requests
+ *   should be aborted. Once this limit has been exceeded, additional 
update requests will fail
+ *   until fields have been removed or the "maxFields" is increased.
+ *   warnOnly - (optional) If true then the URP 
logs verbose warnings
+ *   about the limit being exceeded but doesn't abort update requests. 
Defaults to false
+ *if not specified
+ * 
+ *
+ * @since 9.6.0
+ */
+public class NumFieldLimitingUpdateRequestProcessorFactory extends 
UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final String MAXIMUM_FIELDS_PARAM = "maxFields";
+  private static final String WARN_ONLY_PARAM = "warnOnly";
+
+  private NumFieldsMonitor numFieldsMonitor;
+  private int maximumFields;
+  private boolean warnOnly;
+
+  @Override
+  public void inform(final SolrCore core) {
+// register a commit callback for monitoring the number of fields in the 
schema
+numFieldsMonitor = new NumFieldsMonitor(core);
+core.getUpdateHandler().registerCommitCallback(numFieldsMonitor);
+core.registerNewSearcherListener(numFieldsMonitor);
+  }
+
+  @Override
+  public void init(NamedList args) {
+warnOnly = args.indexOf(WARN_ONLY_PARAM, 0) > 0 ? 
args.getBooleanArg(WARN_ONLY_PARAM) : false;
+
+if (args.indexOf(MAXIMUM_FIELDS_PARAM, 0) < 0) {
+  throw new IllegalArgumentException(
+  "The "
+  + MAXIMUM_FIELDS_PARAM
+  + " parameter is required for "
+  + getClass().getName()
+  + ", but no value was provided.");
+}
+final Object rawMaxFields = args.get(MAXIMUM_FIELDS_PARAM);
+if (rawMaxFields == null || !(rawMaxFields instanceof Integer)) {
+  throw new IllegalArgumentException(
+  MAXIMUM_FIELDS_PARAM + " must be configured as a non-null ");
+}
+maximumFields = (Integer) rawMaxFields;
+if (maximumFields <= 0) {
+  throw new IllegalArgumentException(MAXIMUM_FIELDS_PARAM + " must be a 
positive integer");
+}
+  }
+
+  @Override
+  public UpdateRequestProcessor getInstance(
+  SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor 
next) {
+

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-03 Thread via GitHub


dsmiley commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1589670077


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * significantly from the number of fields defined in managed-schema.xml - 
especially when dynamic
+ * fields are enabled. The only way to reduce this field count is to delete 
documents and wait until
+ * the deleted documents have been removed by segment merges. Users may of 
course speed up this
+ * process by tweaking Solr's segment-merging, triggering an "optimize" 
operation, etc.
+ *
+ * {@link NumFieldLimitingUpdateRequestProcessorFactory} accepts two 
configuration parameters:
+ *
+ * 
+ *   maxFields - (required) The maximum number of fields 
before update requests
+ *   should be aborted. Once this limit has been exceeded, additional 
update requests will fail
+ *   until fields have been removed or the "maxFields" is increased.
+ *   warnOnly - (optional) If true then the URP 
logs verbose warnings
+ *   about the limit being exceeded but doesn't abort update requests. 
Defaults to false
+ *if not specified
+ * 
+ *
+ * @since 9.6.0
+ */
+public class NumFieldLimitingUpdateRequestProcessorFactory extends 
UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final String MAXIMUM_FIELDS_PARAM = "maxFields";
+  private static final String WARN_ONLY_PARAM = "warnOnly";
+
+  private NumFieldsMonitor numFieldsMonitor;
+  private int maximumFields;
+  private boolean warnOnly;
+
+  @Override
+  public void inform(final SolrCore core) {
+// register a commit callback for monitoring the number of fields in the 
schema
+numFieldsMonitor = new NumFieldsMonitor(core);
+core.getUpdateHandler().registerCommitCallback(numFieldsMonitor);
+core.registerNewSearcherListener(numFieldsMonitor);
+  }
+
+  @Override
+  public void init(NamedList args) {
+warnOnly = args.indexOf(WARN_ONLY_PARAM, 0) > 0 ? 
args.getBooleanArg(WARN_ONLY_PARAM) : false;
+
+if (args.indexOf(MAXIMUM_FIELDS_PARAM, 0) < 0) {
+  throw new IllegalArgumentException(
+  "The "
+  + MAXIMUM_FIELDS_PARAM
+  + " parameter is required for "
+  + getClass().getName()
+  + ", but no value was provided.");
+}
+final Object rawMaxFields = args.get(MAXIMUM_FIELDS_PARAM);
+if (rawMaxFields == null || !(rawMaxFields instanceof Integer)) {
+  throw new IllegalArgumentException(
+  MAXIMUM_FIELDS_PARAM + " must be configured as a non-null ");
+}
+maximumFields = (Integer) rawMaxFields;
+if (maximumFields <= 0) {
+  throw new IllegalArgumentException(MAXIMUM_FIELDS_PARAM + " must be a 
positive integer");
+}
+  }
+
+  @Override
+  public UpdateRequestProcessor getInstance(
+  SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor 
next) {
+
+

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-03 Thread via GitHub


dsmiley commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1589657516


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * significantly from the number of fields defined in managed-schema.xml - 
especially when dynamic
+ * fields are enabled. The only way to reduce this field count is to delete 
documents and wait until
+ * the deleted documents have been removed by segment merges. Users may of 
course speed up this
+ * process by tweaking Solr's segment-merging, triggering an "optimize" 
operation, etc.
+ *
+ * {@link NumFieldLimitingUpdateRequestProcessorFactory} accepts two 
configuration parameters:
+ *
+ * 
+ *   maxFields - (required) The maximum number of fields 
before update requests
+ *   should be aborted. Once this limit has been exceeded, additional 
update requests will fail
+ *   until fields have been removed or the "maxFields" is increased.
+ *   warnOnly - (optional) If true then the URP 
logs verbose warnings
+ *   about the limit being exceeded but doesn't abort update requests. 
Defaults to false
+ *if not specified
+ * 
+ *
+ * @since 9.6.0
+ */
+public class NumFieldLimitingUpdateRequestProcessorFactory extends 
UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final String MAXIMUM_FIELDS_PARAM = "maxFields";
+  private static final String WARN_ONLY_PARAM = "warnOnly";
+
+  private NumFieldsMonitor numFieldsMonitor;
+  private int maximumFields;
+  private boolean warnOnly;
+
+  @Override
+  public void inform(final SolrCore core) {
+// register a commit callback for monitoring the number of fields in the 
schema
+numFieldsMonitor = new NumFieldsMonitor(core);
+core.getUpdateHandler().registerCommitCallback(numFieldsMonitor);
+core.registerNewSearcherListener(numFieldsMonitor);
+  }
+
+  @Override
+  public void init(NamedList args) {
+warnOnly = args.indexOf(WARN_ONLY_PARAM, 0) > 0 ? 
args.getBooleanArg(WARN_ONLY_PARAM) : false;
+
+if (args.indexOf(MAXIMUM_FIELDS_PARAM, 0) < 0) {
+  throw new IllegalArgumentException(
+  "The "
+  + MAXIMUM_FIELDS_PARAM
+  + " parameter is required for "
+  + getClass().getName()
+  + ", but no value was provided.");
+}
+final Object rawMaxFields = args.get(MAXIMUM_FIELDS_PARAM);
+if (rawMaxFields == null || !(rawMaxFields instanceof Integer)) {
+  throw new IllegalArgumentException(
+  MAXIMUM_FIELDS_PARAM + " must be configured as a non-null ");
+}
+maximumFields = (Integer) rawMaxFields;
+if (maximumFields <= 0) {
+  throw new IllegalArgumentException(MAXIMUM_FIELDS_PARAM + " must be a 
positive integer");
+}
+  }
+
+  @Override
+  public UpdateRequestProcessor getInstance(
+  SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor 
next) {
+
+

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-03 Thread via GitHub


dsmiley commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1589647578


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldsMonitor.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.core.AbstractSolrEventListener;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.RefCounted;
+
+public class NumFieldsMonitor extends AbstractSolrEventListener {
+
+  private int numFields;
+
+  public NumFieldsMonitor(SolrCore core) {
+super(core);
+this.numFields = -1;
+  }
+
+  @Override
+  public void postCommit() {
+RefCounted indexSearcher = null;
+try {
+  indexSearcher = getCore().getSearcher();

Review Comment:
   Maybe; Java exception handling is such a PITA I didn't want to defeat 
the lambda convenience if the lambda threw an IOException which is rather 
common.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-03 Thread via GitHub


dsmiley commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1589645861


##
solr/core/src/test-files/solr/configsets/cloud-minimal-field-limiting/conf/solrconfig.xml:
##
@@ -0,0 +1,73 @@
+
+
+
+
+
+
+
+
+  ${solr.data.dir:}
+
+  
+  
+
+  ${tests.luceneMatchVersion:LATEST}
+
+  
+
+  ${solr.commitwithin.softcommit:true}
+
+
+  
+
+  
+
+  explicit
+  true
+  text
+
+  
+
+  
+${solr.test.maxFields:1234}
+  

Review Comment:
   Having no-config vs config doesn't conceptually change where the 
 element can be placed.
   ex:
   ``
   vs
   ```
   
 
   
   ```
   Can be in the same places.
   I went to the ref guide's [page on 
URPs](https://solr.apache.org/guide/solr/latest/configuration-guide/update-request-processors.html#custom-update-request-processor-chain)
 and observe the first example shows a chain with several URPs, one of which 
has config.  I think it's natural to place URPs inline in the chain.  
Separating it out is for composability if there are multiple chains and URPs 
and you want to avoid possible repetition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-03 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1589189040


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * significantly from the number of fields defined in managed-schema.xml - 
especially when dynamic
+ * fields are enabled. The only way to reduce this field count is to delete 
documents and wait until
+ * the deleted documents have been removed by segment merges. Users may of 
course speed up this
+ * process by tweaking Solr's segment-merging, triggering an "optimize" 
operation, etc.
+ *
+ * {@link NumFieldLimitingUpdateRequestProcessorFactory} accepts two 
configuration parameters:
+ *
+ * 
+ *   maxFields - (required) The maximum number of fields 
before update requests
+ *   should be aborted. Once this limit has been exceeded, additional 
update requests will fail
+ *   until fields have been removed or the "maxFields" is increased.
+ *   warnOnly - (optional) If true then the URP 
logs verbose warnings
+ *   about the limit being exceeded but doesn't abort update requests. 
Defaults to false
+ *if not specified
+ * 
+ *
+ * @since 9.6.0
+ */
+public class NumFieldLimitingUpdateRequestProcessorFactory extends 
UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final String MAXIMUM_FIELDS_PARAM = "maxFields";
+  private static final String WARN_ONLY_PARAM = "warnOnly";
+
+  private NumFieldsMonitor numFieldsMonitor;
+  private int maximumFields;
+  private boolean warnOnly;
+
+  @Override
+  public void inform(final SolrCore core) {
+// register a commit callback for monitoring the number of fields in the 
schema
+numFieldsMonitor = new NumFieldsMonitor(core);
+core.getUpdateHandler().registerCommitCallback(numFieldsMonitor);
+core.registerNewSearcherListener(numFieldsMonitor);
+  }
+
+  @Override
+  public void init(NamedList args) {
+warnOnly = args.indexOf(WARN_ONLY_PARAM, 0) > 0 ? 
args.getBooleanArg(WARN_ONLY_PARAM) : false;
+
+if (args.indexOf(MAXIMUM_FIELDS_PARAM, 0) < 0) {
+  throw new IllegalArgumentException(
+  "The "
+  + MAXIMUM_FIELDS_PARAM
+  + " parameter is required for "
+  + getClass().getName()
+  + ", but no value was provided.");
+}
+final Object rawMaxFields = args.get(MAXIMUM_FIELDS_PARAM);
+if (rawMaxFields == null || !(rawMaxFields instanceof Integer)) {
+  throw new IllegalArgumentException(
+  MAXIMUM_FIELDS_PARAM + " must be configured as a non-null ");
+}
+maximumFields = (Integer) rawMaxFields;
+if (maximumFields <= 0) {
+  throw new IllegalArgumentException(MAXIMUM_FIELDS_PARAM + " must be a 
positive integer");
+}
+  }
+
+  @Override
+  public UpdateRequestProcessor getInstance(
+  SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor 
next) {
+

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-03 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1589268556


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldsMonitor.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.core.AbstractSolrEventListener;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.RefCounted;
+
+public class NumFieldsMonitor extends AbstractSolrEventListener {
+
+  private int numFields;
+
+  public NumFieldsMonitor(SolrCore core) {
+super(core);
+this.numFields = -1;
+  }
+
+  @Override
+  public void postCommit() {
+RefCounted indexSearcher = null;
+try {
+  indexSearcher = getCore().getSearcher();

Review Comment:
   Is there a reason that `withSearcher` throws an IOException?  Would it be 
worth having one of those methods that consumes a vanilla Function instead of a 
`IOFunction`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-03 Thread via GitHub


gerlowskija commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2093110049

   Thanks for the re-review @dsmiley .  Pending feedback from others I'm going 
to stick with the NumFieldMonitor approach (more details 
[here](https://github.com/apache/solr/pull/2395#discussion_r1589189040)), but 
I've addressed the rest of your suggestions.  Lmk what you think!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-03 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1589215019


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldsMonitor.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.core.AbstractSolrEventListener;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.RefCounted;
+
+public class NumFieldsMonitor extends AbstractSolrEventListener {

Review Comment:
   > I suppose you didn't like my proposal that would eliminate the need for 
this event listener
   
   Yes, in short.  I agree it's a matter of opinion, but I prefer the 
particular complexity and benefits that NumFieldsMonitor offers to that of the 
direct-SIS or weak-reference approach.  See my reply [on the factory 
above](https://github.com/apache/solr/pull/2395/files#diff-6da5de1a5fc66295d066909ad976a15ad31e8411ed71d7909e8ed0d14f3cac78R103)
 for a bit more detail/explanation.



##
solr/core/src/test/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorIntegrationTest.java:
##
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.UUID;
+import java.util.concurrent.TimeUnit;
+import org.apache.solr.client.solrj.request.CollectionAdminRequest;
+import org.apache.solr.cloud.SolrCloudTestCase;
+import org.apache.solr.common.SolrInputDocument;
+import org.hamcrest.Matchers;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class NumFieldLimitingUpdateRequestProcessorIntegrationTest extends 
SolrCloudTestCase {
+
+  private static String SINGLE_SHARD_COLL_NAME = "singleShardColl";

Review Comment:
   I think this is copy/paste from a test that used both single and multi-shard 
collections.  I've updated the name to be more generic.



##
solr/core/src/test-files/solr/configsets/cloud-minimal-field-limiting/conf/solrconfig.xml:
##
@@ -0,0 +1,73 @@
+
+
+
+
+
+
+
+
+  ${solr.data.dir:}
+
+  
+  
+
+  ${tests.luceneMatchVersion:LATEST}
+
+  
+
+  ${solr.commitwithin.softcommit:true}
+
+
+  
+
+  
+
+  explicit
+  true
+  text
+
+  
+
+  
+${solr.test.maxFields:1234}
+  

Review Comment:
   Tbh I wasn't aware that you could declare processors (that require config) 
directly *in* a processor chain.  I can't find an example of that.  Looking 
through several other solrconfig.xml files for URPs that require 
configuration...they all look similar to what I've done here afaict.
   
   (See the [_default configset Solr 
ships](https://github.com/apache/solr/blob/main/solr/server/solr/configsets/_default/conf/solrconfig.xml#L900-L944)
 and the [default configset in the 'core' modules 
tests](https://github.com/apache/solr/blob/main/solr/core/src/test-files/solr/configsets/_default/conf/solrconfig.xml#L61-L81)...both
 look similar to the example here)
   
   Maybe I'm misunderstanding you or missing something though - can you point 
to an example of how you'd prefer to see this declared, and a reason it's 
preferable? 



##
solr/solr-ref-guide/modules/configuration-guide/pages/update-request-processors.adoc:
##
@@ -337,6 +337,12 @@ Documents processed prior to the offender 

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-05-01 Thread via GitHub


dsmiley commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1586827784


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessor.java:
##
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.Locale;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class NumFieldLimitingUpdateRequestProcessor extends 
UpdateRequestProcessor {

Review Comment:
   javadoc needed, even if one line linking to a factory for further info.
   Or don't make public :-)



##
solr/core/src/java/org/apache/solr/update/processor/NumFieldsMonitor.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.core.AbstractSolrEventListener;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.RefCounted;
+
+public class NumFieldsMonitor extends AbstractSolrEventListener {
+
+  private int numFields;
+
+  public NumFieldsMonitor(SolrCore core) {
+super(core);
+this.numFields = -1;
+  }
+
+  @Override
+  public void postCommit() {
+RefCounted indexSearcher = null;
+try {
+  indexSearcher = getCore().getSearcher();

Review Comment:
   see SolrCore.withSearcher to make this code less involved.  I added that 
long ago and getSearcher's javadocs recommend using it.



##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts 

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-30 Thread via GitHub


gerlowskija commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2086978057

   Alright, I've removed a lot of the leadership and shard checking logic based 
on review feedback.  I think this should be ready to go, unless you have 
remaining concerns @dsmiley ?
   
   I've also added the new URP to the URP-chain defined in the "_default" 
configset (`/solr/server/solr/configsets/_default`).  My intent here is that 
this part of the PR would be a "main" only change.  9.x would either remain 
without the URP in the _default configset, or have it enabled but in "warnOnly" 
mode (my personal preference).  I used a relatively high field limit, 1000, but 
we can raise it if folks think 1000 is too low?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-30 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1585300753


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessor.java:
##
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.Locale;
+import org.apache.solr.cloud.CloudDescriptor;
+import org.apache.solr.cloud.ZkController;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.common.cloud.Slice;
+import org.apache.solr.common.util.StrUtils;
+import org.apache.solr.core.CoreDescriptor;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class NumFieldLimitingUpdateRequestProcessor extends 
UpdateRequestProcessor {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  private SolrQueryRequest req;
+  private int fieldThreshold;
+  private int currentNumFields;
+  private boolean warnOnly;
+
+  public NumFieldLimitingUpdateRequestProcessor(
+  SolrQueryRequest req,
+  UpdateRequestProcessor next,
+  int fieldThreshold,
+  int currentNumFields,
+  boolean warnOnly) {
+super(next);
+this.req = req;
+this.fieldThreshold = fieldThreshold;
+this.currentNumFields = currentNumFields;
+this.warnOnly = warnOnly;
+  }
+
+  public void processAdd(AddUpdateCommand cmd) throws IOException {
+if (!isCloudMode() || /* implicit isCloudMode==true */ 
isLeaderThatOwnsTheDoc(cmd)) {

Review Comment:
   Yeah, I'm fine with moving away from that.
   
   In fact, rather than move the leader-check to the Factory, I think I'll omit 
it altogether.  I don't think it makes much sense if we're not checking the 
shard as well.  Though if you see a way for that to work, I'm happy to add it 
back in.
   



##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this 

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-16 Thread via GitHub


dsmiley commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1567989670


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessor.java:
##
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.Locale;
+import org.apache.solr.cloud.CloudDescriptor;
+import org.apache.solr.cloud.ZkController;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.common.cloud.Slice;
+import org.apache.solr.common.util.StrUtils;
+import org.apache.solr.core.CoreDescriptor;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class NumFieldLimitingUpdateRequestProcessor extends 
UpdateRequestProcessor {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  private SolrQueryRequest req;
+  private int fieldThreshold;
+  private int currentNumFields;
+  private boolean warnOnly;
+
+  public NumFieldLimitingUpdateRequestProcessor(
+  SolrQueryRequest req,
+  UpdateRequestProcessor next,
+  int fieldThreshold,
+  int currentNumFields,
+  boolean warnOnly) {
+super(next);
+this.req = req;
+this.fieldThreshold = fieldThreshold;
+this.currentNumFields = currentNumFields;
+this.warnOnly = warnOnly;
+  }
+
+  public void processAdd(AddUpdateCommand cmd) throws IOException {
+if (!isCloudMode() || /* implicit isCloudMode==true */ 
isLeaderThatOwnsTheDoc(cmd)) {

Review Comment:
   isCloudMode -- let's not; the user can configure or not as they wish.
   
   "isLeaderThatOwnsTheDoc" can also be removed.  Instead in the URP Factory, 
if we are not a leader, then skip this URP.  The "owns the doc concept" I think 
we are agreeing we don't need in this PR.



##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * 

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-16 Thread via GitHub


gerlowskija commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2059268487

   > I don't like the complexity in this URP relating to tolerance of where the 
URP is placed in the chain; I'd feel better if the URP were simplified from 
that concern and we expect the user to place it at an appropriate spot
   
   Yeah I understand.  I still feel a little reluctant I guess, but I'm likely 
being paranoid.  It's harder to feel good about the "win" in making things 
safer for users if the change opens up another trappy state when misconfigured. 
 But I'll take out the leader-check for now.  We can always re-evaluate later 
if we see people getting bitten by this, and there are other ways to mitigate 
the risk of misconfig (e.g. putting into the "_default" config for 10.0 so 
users needn't tweak things).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-11 Thread via GitHub


dsmiley commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2050095834

   I don't like the complexity in this URP relating to tolerance of where the 
URP is placed in the chain; I'd feel better if the URP were simplified from 
that concern and we expect the user to place it at an appropriate spot.   We 
don't have complexity like that elsewhere and/or I argue it's not the right 
approach.
   
   I sympathize with why an URP feels right.  Okay.  On each `addDoc`, don't we 
just need to do the check on the current SolrIndexSearcher but remember who the 
searcher was so that the next `addDoc` can see it's the same searcher and if so 
don't do the check?  It'd need to cache the searcher reference for the instance 
equality; use a `WeakReference` so we allow the searcher to close & GC.  If we 
do this, we don't need a commit event listener, thus don't need an additional 
component / confusing interaction complexity.  Don't get the SolrIndexSearcher 
from the request (we don't want a per-request cache instance), get it from the 
SolrCore so we see a possibly evolving SolrIndexSearcher.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-11 Thread via GitHub


gerlowskija commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2049901831

   > Sorry if I rained on your parade, attempting to get this in by EOW.
   
   Oh no worries; there's no particular rush.  That was just how long I was 
planning to wait if no one commented in the interim.  I'm glad you both chimed 
in! 
   
   Though tbh I can't quite tell where you are on the PR approach overall:
   
   1. Are you '-1' this being an URP at all?
   2. Are you OK with this being an URP, with the minor tweaks you mentioned 
(e.g. documenting that it goes _after_ DUP, etc.)?
   3. Are you only OK with this being an URP, but only if we make some 
improvements to the URP framework (e.g. to ensure certain URPs always follow 
DUP)
   4. ...something else?
   
   Please let me know!
   
   One last reply inline:
   
   > To block all indexing, it could set SolrCore.readOnly = true and be done 
with it
   
   IMO `SolrCore.readOnly` is too blunt a tool for the "field limiting" this PR 
attempts.  First because `readOnly` would block deletions and operations that 
we'd want to allow.  And second because a user blocked by 'readOnly' would have 
their `/update` requests fail with a message that's pretty inscrutable or at 
least unrelated to the root cause.
   
   That's what I like about the URP approach - it gives us more granularity 
into the different types of "/update" operations and allows us to get a nice 
(or at least, nicer) error message back to users.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-11 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1560999237


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessor.java:
##
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.Locale;
+import org.apache.solr.cloud.CloudDescriptor;
+import org.apache.solr.cloud.ZkController;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.common.cloud.Slice;
+import org.apache.solr.common.util.StrUtils;
+import org.apache.solr.core.CoreDescriptor;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class NumFieldLimitingUpdateRequestProcessor extends 
UpdateRequestProcessor {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  private SolrQueryRequest req;
+  private int fieldThreshold;
+  private int currentNumFields;
+  private boolean warnOnly;
+
+  public NumFieldLimitingUpdateRequestProcessor(
+  SolrQueryRequest req,
+  UpdateRequestProcessor next,
+  int fieldThreshold,
+  int currentNumFields,
+  boolean warnOnly) {
+super(next);
+this.req = req;
+this.fieldThreshold = fieldThreshold;
+this.currentNumFields = currentNumFields;
+this.warnOnly = warnOnly;
+  }
+
+  public void processAdd(AddUpdateCommand cmd) throws IOException {
+if (!isCloudMode() || /* implicit isCloudMode==true */ 
isLeaderThatOwnsTheDoc(cmd)) {
+  if (coreExceedsFieldLimit()) {
+throwExceptionOrLog(cmd.getPrintableId());
+  } else {
+if (log.isDebugEnabled()) {
+  log.debug(
+  "Allowing document {}, since current core is under the max-field 
limit (numFields={}, maxThreshold={})",

Review Comment:
   Done  



##
solr/core/src/java/org/apache/solr/update/processor/NumFieldsMonitor.java:
##
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.core.AbstractSolrEventListener;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.RefCounted;
+
+public class NumFieldsMonitor extends AbstractSolrEventListener {
+
+  private int numFields;
+
+  public NumFieldsMonitor(SolrCore core) {
+super(core);
+this.numFields = -1;
+  }
+
+  @Override
+  public void postCommit() {
+RefCounted indexSearcher = null;
+try {
+  indexSearcher = getCore().getSearcher();
+  // Get the number of fields directly from the IndexReader instead of the 
Schema object to also
+  // include the

Review Comment:
   > formatting?
   
   Done  
   
   > are there any other interesting places in Solr to use a NumFieldsMonitor?
   
   Good question.  It does seem like a kindof neat thing to have around, but I 
can't think of any other use-cases off the top of my head...



##
solr/core/src/test/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactoryTest.java:
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license 

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-11 Thread via GitHub


gerlowskija commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1561009284


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * significantly from the number of fields defined in managed-schema.xml - 
especially when dynamic
+ * fields are enabled. The only way to reduce this field count is to delete 
documents and wait until
+ * the deleted documents have been removed by segment merges. Users may of 
course speed up this
+ * process by tweaking Solr's segment-merging, triggering an "optimize" 
operation, etc.
+ *
+ * {@link NumFieldLimitingUpdateRequestProcessorFactory} accepts two 
configuration parameters:
+ *
+ * 
+ *   maxFields - (required) The maximum number of fields 
before update requests
+ *   should be aborted. Once this limit has been exceeded, additional 
update requests will fail
+ *   until fields have been removed or the "maxFields" is increased.
+ *   warnOnly - (optional) If true then the URP 
logs verbose warnings
+ *   about the limit being exceeded but doesn't abort update requests. 
Defaults to false
+ *if not specified
+ * 
+ *
+ * @since 9.6.0
+ */
+public class NumFieldLimitingUpdateRequestProcessorFactory extends 
UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final String MAXIMUM_FIELDS_PARAM = "maxFields";
+  private static final String WARN_ONLY_PARAM = "warnOnly";
+
+  private NumFieldsMonitor numFieldsMonitor;
+  private int maximumFields;
+  private boolean warnOnly;
+
+  @Override
+  public void inform(final SolrCore core) {
+// register a commit callback for monitoring the number of fields in the 
schema
+numFieldsMonitor = new NumFieldsMonitor(core);
+core.getUpdateHandler().registerCommitCallback(numFieldsMonitor);
+core.registerNewSearcherListener(numFieldsMonitor);
+  }
+
+  public void init(NamedList args) {
+warnOnly = args.indexOf(WARN_ONLY_PARAM, 0) > 0 ? 
args.getBooleanArg(WARN_ONLY_PARAM) : false;
+
+if (args.indexOf(MAXIMUM_FIELDS_PARAM, 0) < 0) {
+  throw new IllegalArgumentException(
+  "The "
+  + MAXIMUM_FIELDS_PARAM
+  + " parameter is required for "
+  + getClass().getName()
+  + ", but no value was provided.");
+}
+final Object rawMaxFields = args.get(MAXIMUM_FIELDS_PARAM);

Review Comment:
   Agreed - it's really unfortunate.  I looked at a number of URP's and 
NamedListInitializedPlugin's, but couldn't find anything I liked.
   
   Maybe we could file a ticket for that separately; it'd be a really nice 
cleanup for all our NLIP's to use a set of utilities like:
   
   ```
   int getRequiredNamedListValueAsInteger(NamedList nl, String paramName);
   Integer getNamedListValueAsInteger(NamedList nl, String paramName);
   boolean getRequiredNamedListValueAsBoolean(NamedList nl, String paramName);
   

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-10 Thread via GitHub


dsmiley commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2048498024

   Sorry if I rained on your parade, attempting to get this in by EOW.
   
   I like the idea of having checks like this by default but I'd like Solr to 
have such built-in plugins simply be registered automatically unless you set 
some env/sys-prop to disable, like 
"solr.core.eventListener.fieldLimit.enabled".  We already register a ton of 
query parsers even though nobody asked for them :-). I think that should be the 
model for built-in plugins so that we can continue slowly to reduce what's in a 
solrconfig.xml yet also allow someone to disable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-10 Thread via GitHub


epugh commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1559751361


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldsMonitor.java:
##
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.core.AbstractSolrEventListener;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.RefCounted;
+
+public class NumFieldsMonitor extends AbstractSolrEventListener {
+
+  private int numFields;
+
+  public NumFieldsMonitor(SolrCore core) {
+super(core);
+this.numFields = -1;
+  }
+
+  @Override
+  public void postCommit() {
+RefCounted indexSearcher = null;
+try {
+  indexSearcher = getCore().getSearcher();
+  // Get the number of fields directly from the IndexReader instead of the 
Schema object to also
+  // include the

Review Comment:
   are there any other interesting places in Solr to use a NumFieldsMonitor?   



##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessor.java:
##
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.Locale;
+import org.apache.solr.cloud.CloudDescriptor;
+import org.apache.solr.cloud.ZkController;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.common.cloud.Slice;
+import org.apache.solr.common.util.StrUtils;
+import org.apache.solr.core.CoreDescriptor;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class NumFieldLimitingUpdateRequestProcessor extends 
UpdateRequestProcessor {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  private SolrQueryRequest req;
+  private int fieldThreshold;
+  private int currentNumFields;
+  private boolean warnOnly;
+
+  public NumFieldLimitingUpdateRequestProcessor(
+  SolrQueryRequest req,
+  UpdateRequestProcessor next,
+  int fieldThreshold,
+  int currentNumFields,
+  boolean warnOnly) {
+super(next);
+this.req = req;
+this.fieldThreshold = fieldThreshold;
+this.currentNumFields = currentNumFields;
+this.warnOnly = warnOnly;
+  }
+
+  public void processAdd(AddUpdateCommand cmd) throws IOException {
+if (!isCloudMode() || /* implicit isCloudMode==true */ 
isLeaderThatOwnsTheDoc(cmd)) {
+  if (coreExceedsFieldLimit()) {
+throwExceptionOrLog(cmd.getPrintableId());
+  } else {
+if (log.isDebugEnabled()) {
+  log.debug(
+  "Allowing document {}, since current core is under the max-field 
limit (numFields={}, maxThreshold={})",

Review Comment:
   max-field or maxField?



##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the 

Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-10 Thread via GitHub


gerlowskija commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2047973430

   This should all be ready to go: I've added unit and integration tests, 
`./gradlew test` passes, there's ref-guide coverage and decent Javadocs, etc.  
I'll aim to merge EOW, absent any feedback folks might have here.
   
   One big question I have is: should we add this into the "_default" 
configset's default URP-chain?  At least for 10.x?
   
   I'd argue we should.  Safety guardrails like this one aren't any good unless 
users enable them, and having it be enabled by default is the best way to do 
that.  We might get some complaints from folks who run into this during an 
inplace 9->10 upgrade, but:
   
   1. We tell users to carefully reconsider their configsets when doing 
upgrades, so this shouldn't be a surprise.
   2. We can highlight it clearly in the upgrade notes
   3. Users who hit this on an in-place upgrade can always bump the limit, 
toggle the URP into 'warnOnly' mode, or disable the URP altogether if they 
really want.
   
   I'd love a sanity-check/+1 from someone else though before I make this 
enabled by default on `main`/10.x, so please chime in!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-09 Thread via GitHub


gerlowskija commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2045508653

   In a sample solrconfig.xml, configuration for the proposed URP would look 
like:
   
   ```
 
   123
 
 ...
 
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-09 Thread via GitHub


gerlowskija commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2045498246

   Still TBD - additional tests, ref-guide docs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-09 Thread via GitHub


gerlowskija opened a new pull request, #2395:
URL: https://github.com/apache/solr/pull/2395

   https://issues.apache.org/jira/browse/SOLR-17192
   
   # Description
   
   Solr isn't infinitely scalable when it comes to the number of fields in each 
core/collection. Most deployments start to experience problems any time a core 
has upwards of a few hundred fields. Usually this doesn't exhibit itself right 
away. instead waiting until segment-merge or some other time to rear its head.
   
   Despite this being a known limitation Solr doesn't have any (active) way of 
helping users avoid this, excepting one or two references buried in the Solr 
ref-gudie. 
   
   # Solution
   
   This commit adds a new UpdateRequestProcessor, 
`NumFieldsLimitingUpdateRequestProcessor`, that flags potentially-dangerous 
schema design for users by failing all updates once the relevant core has 
exceeded a configurable limit.  The proposed URP has two configuration 
properties:
   
   - `maxFields` - (required) a positive integer value representing the maximum 
number of fields a core should be allowed to have.
   - `warnOnly` - (optional, defaults to 'false' if not specified) a boolean 
flag indicating whether Solr should "really" fail requests once the threshold 
has been hit, or should merely log a warning instead.
   
   # Tests
   
   Unit tests in `NumFieldsLimitingUpdateRequestProcessorFactoryTest`.  Still 
needs some higher level testing prior to merge.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my 
code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [ ] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org