[ https://issues.apache.org/jira/browse/SOLR-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16049300#comment-16049300 ]
Erick Erickson commented on SOLR-10574:
---------------------------------------

I'll add a yes to the managed schema having an .xml extension. Agreed, make it a separate issue.

Catch-all _text_ field: yes.
Enabled by default: yes, with a warning.

Since this is not for production anyway, we might as well make it as easy as possible to get started.

If we're going to enable data_driven, we should have a catch-all field enabled by default. Neither one is something I'd recommend going to production with without close examination. So to me it's a "both or neither" preference.

The point of having data_driven as the default is to lower first-time barriers to entry. If the catch-all field is there and it's the pre-configured "df" for the request handlers, people get results the first time they index and search without even knowing they have fields in their documents. Otherwise they're left scratching their heads because they indexed stuff but didn't find anything. So we'd then tell them "Examine your index to see what fields were actually defined, and do fielded search" ('cause they don't even necessarily know what the docs look like!), "or enable a catch-all field and re-index". That is only a minimal improvement in first-time experience over what we have now: at least they were able to index docs, if not successfully search them, the first time they tried.

Perhaps the warning (in the schema file and in startup guides or maybe "taking Solr to production") is something akin to "add-unknown-fields-to-the-schema and the default behavior of copying all fields to _text_ are options intended for getting started. Production systems rarely enable either of these two options. See solrconfig.xml and managed-schema(.xml) for the text 'RARELY ENABLED FOR PRODUCTION'".

Or something like that.
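
For illustration, the catch-all setup described above might look roughly like the fragments below. This is a sketch under assumptions, not the configset that was actually committed: it assumes a text_general field type is already defined in the schema, and the /select handler is only an example of where "df" could be pre-configured. The warning comments reuse the wording suggested in the comment.

```xml
<!-- managed-schema (sketch; assumes a "text_general" fieldType exists) -->

<!-- RARELY ENABLED FOR PRODUCTION: catch-all field for getting started -->
<field name="_text_" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- RARELY ENABLED FOR PRODUCTION: copy every field into the catch-all -->
<copyField source="*" dest="_text_"/>


<!-- solrconfig.xml (sketch; handler shown only as an example) -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Unfielded queries search the catch-all field by default -->
    <str name="df">_text_</str>
  </lst>
</requestHandler>
```

With something like this in place, an unfielded query such as q=foo would be run against _text_, so a first-time user could get hits without knowing any field names.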
> Choose a default configset for Solr 7
> -------------------------------------
>
>                 Key: SOLR-10574
>                 URL: https://issues.apache.org/jira/browse/SOLR-10574
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Ishan Chattopadhyaya
>            Priority: Blocker
>             Fix For: master (7.0)
>
>         Attachments: SOLR-10574.patch, SOLR-10574.patch, SOLR-10574.patch
>
>
> Currently, the data_driven_schema_configs is the default configset when
> collections are created using the bin/solr script and no configset is
> specified.
> However, that may not be the best choice. We need to decide which is the best
> choice, out of the box, considering many users might create collections
> without knowing about the concept of a configset going forward.
> (See also SOLR-10272)
> Proposed changes:
> # Remove data_driven_schema_configs and basic_configs
> # Introduce a combined configset, {{_default}} based on the above two
> configsets.
> # Build a "toggleable" data driven functionality into {{_default}}
> Usage:
> # Create a collection (using _default configset)
> # Data driven / schemaless functionality is enabled by default; so just start
> indexing your documents.
> # If you don't want data driven / schemaless, disable this behaviour:
> {code}
> curl http://host:8983/solr/coll1/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'
> {code}
> # Create schema fields using schema API, and index documents

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org