[ https://issues.apache.org/jira/browse/SOLR-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ishan Chattopadhyaya updated SOLR-10574:
----------------------------------------
    Attachment: SOLR-10574.patch

Apologies for my radio silence; here is a bit of an update. I had offline discussions with [~noblepaul], [~hossman], and [~shalinmangar]. There were various approaches that I was considering:

# The initParams-based mechanism for enabling/disabling the data driven nature. I discarded this, considering Noble's concern that initParams, with its globbing/wildcard support, is a risky tool that lets users shoot themselves in the foot (if they get the wildcards wrong), and hence we may want to remove initParams support going forward.
# Creating the chain programmatically. This was not easy, since the AddSchemaFieldsUpdateProcessorFactory needs field type names as defined in the managed-schema/schema.xml. Hence, if the chain were created programmatically, the user would not be able to switch it to point fields instead of trie fields (or vice versa), for example.
# Adding "update.chain=add-unknown-fields-to-the-schema" to every paramset in ImplicitPlugins.json and then letting the user enable/disable the data driven nature by updating the "update.chain" parameter's value through the config API. This approach exposed too many internals (the notion of an "update chain", the name of the chain, etc.) in the command to enable/disable the data driven nature, and hence was potentially confusing.

A very important consideration in setting up this enable/disable feature was the following: if we use the "add-unknown-fields-to-the-schema" update chain exactly as it is defined in data_driven_schema_configs today, it would be impossible for the user to modify the update chain (or parts of it) using the config API, because the config API cannot edit URPs that are within an update chain, and it also doesn't support creating/editing update chains.
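
For context, the limitation above comes from how the chain is declared. In data_driven_schema_configs the processors are nested anonymously inside the chain element, roughly as in the abridged sketch below. This is reproduced from memory of the 6.x solrconfig.xml, so the exact processor list, parameters and field type names are assumptions and may differ from the shipped file; elided parts are marked in comments. Since none of the nested processors has a name of its own, the config API has nothing to address them by.

{code:xml}
<!-- Abridged sketch, not the verbatim shipped config -->
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <!-- Anonymous processors: they exist only as children of this chain -->
  <processor class="solr.UUIDUpdateProcessorFactory"/>
  <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
  <!-- field name mutating and value parsing processors elided -->
  <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
    <!-- Field type names must match types defined in managed-schema -->
    <str name="defaultFieldType">strings</str>
    <lst name="typeMapping">
      <str name="valueClass">java.lang.Long</str>
      <!-- assumed trie-based type; a point-based type would go here instead -->
      <str name="fieldType">tlongs</str>
    </lst>
    <!-- remaining type mappings elided -->
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
{code}

The fieldType entries are also why building the chain in code is awkward: they are plain strings that have to agree with whatever types the schema actually defines.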

So, the solution (as in the patch) was to break out the individual URPs in the add-unknown-fields-to-the-schema chain into top level named URPs (hence they would be editable using config APIs) and creating a chain using those named URPs that is functionally similar. There is a nice, not well documented, default=true|false attribute for update chains that has been (and should have been all along) used to enable/disable the data driven nature (based on a variable).

So, *TLDR*; check out the new {{_default}} configset in the patch. It has data driven nature enabled by default. The data driven nature can be enabled/disabled using the following:

{code}
# Disable schemaless/data driven nature:
curl http://localhost:8983/solr/mycollection/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'

# Enable schemaless/data driven nature:
curl http://localhost:8983/solr/mycollection/config -d '{"set-user-property": {"update.autoCreateFields":"true"}}'
{code}

Would appreciate a review.

Note: the patch contains only the new default configset. However, we also need to remove the existing data_driven_schema_configs and basic_configs and update the script. Also, I haven't consolidated the managed-schema differences between basic_configs and data_driven_schema_configs into this {{_default}} configset yet.

> Choose a default configset for Solr 7
> -------------------------------------
>
>                 Key: SOLR-10574
>                 URL: https://issues.apache.org/jira/browse/SOLR-10574
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Ishan Chattopadhyaya
>            Priority: Blocker
>             Fix For: master (7.0)
>
>         Attachments: SOLR-10574.patch
>
>
> Currently, the data_driven_schema_configs is the default configset when
> collections are created using the bin/solr script and no configset is
> specified.
> However, that may not be the best choice.
> We need to decide which is the best
> choice, out of the box, considering many users might create collections
> without knowing about the concept of a configset going forward.
> (See also SOLR-10272)
> Proposed changes:
> # Let's deprecate what we know as data_driven_schema_configs
> # Build a "toggleable" data driven functionality into the basic_configs
> configset (and make it the default)