[ https://issues.apache.org/jira/browse/SOLR-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Rowe updated SOLR-9526:
-----------------------------
    Attachment: SOLR-9526.patch

Attaching patch brought up to date with master (in particular, collapsing of {{data_driven_schema_configs}} and {{basic_configs}} into {{_default}}) - note that your original patch only modified {{solrconfig.xml}} on one of these and {{managed_schema}} on the other - I assume you had/have local changes that didn't make it into the patch [~janhoy]? I made a couple of other changes; details below.

{quote}
See new NOCOMMIT comments. I was using the ManagedIndexSchema method
{code}
public ManagedIndexSchema addCopyFields(String source, Collection<String> destinations, int maxChars)
{code}
which does not have a {{persist=true/false}} argument, so calling it leaves the schema not persisted. Then I could not find a way to explicitly persist it since method {{boolean persistManagedSchema(boolean createOnly)}} was not public. In this patch I've made it public and done a hacky instanceof check in AddSchemaFieldsUpdateProcessorFactory
{code}
if (newSchema instanceof ManagedIndexSchema) {
  // NOCOMMIT: Hack to avoid persisting schema once after addFields and then once after each copyField
  ((ManagedIndexSchema)newSchema).persistManagedSchema(false);
}
{code}
Steve Rowe, you wrote the {{addCopyFields()}} method a while ago, is there a cleaner way to make sure schema is persisted after adding a copyField?
{quote}

The design of {{ManagedIndexSchema}}'s API was in support of the Schema REST API, where each resource was modifiable one at a time; "bulk" modifications weren't possible. In the new bulk schema API, though, the ordinary case involves multiple modifications; in this case, it is counter-productive to persist in the middle of a set of operations. SOLR-6476 (introducing schema "bulk" mode) added the option to *not* persist the schema after an operation; previously every operation was automatically persisted. This was added as an option because at the time, bulk and REST modes co-existed.
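
To make the "persist once per batch" idea concrete, here is a minimal, self-contained sketch. It deliberately does not use Solr's classes: {{BulkPersistSketch}}, {{SchemaStore}}, {{SketchSchema}} and all of their methods are hypothetical stand-ins, and the store only counts writes. It assumes an immutable schema whose modification methods return a new instance, with persistence exposed as a separate call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: none of these classes or methods are Solr's. */
public class BulkPersistSketch {

    /** Stand-in for wherever the schema is stored; it only counts writes. */
    static final class SchemaStore {
        int writes = 0;

        void write(SketchSchema schema) {
            writes++;
        }
    }

    /** Immutable schema: every modification returns a new instance. */
    static final class SketchSchema {
        final SchemaStore store;
        final List<String> fields;
        final List<String> copyFields;

        SketchSchema(SchemaStore store, List<String> fields, List<String> copyFields) {
            this.store = store;
            this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
            this.copyFields = Collections.unmodifiableList(new ArrayList<>(copyFields));
        }

        /** One-at-a-time style: the caller chooses whether to persist immediately. */
        SketchSchema addField(String name, boolean persist) {
            List<String> newFields = new ArrayList<>(fields);
            newFields.add(name);
            SketchSchema next = new SketchSchema(store, newFields, copyFields);
            if (persist) {
                next.persist();
            }
            return next;
        }

        /** No persist flag here, so the caller has to persist explicitly afterwards. */
        SketchSchema addCopyFields(String source, Collection<String> destinations) {
            List<String> newCopyFields = new ArrayList<>(copyFields);
            for (String dest : destinations) {
                newCopyFields.add(source + "->" + dest);
            }
            return new SketchSchema(store, fields, newCopyFields);
        }

        /** Public so that bulk callers can persist once, at the end of a batch. */
        public void persist() {
            store.write(this);
        }
    }

    private static SketchSchema emptySchema(SchemaStore store) {
        return new SketchSchema(store, new ArrayList<String>(), new ArrayList<String>());
    }

    /** Applies three modifications, persists once, and returns the number of writes. */
    public static int writesWhenPersistingOnce() {
        SchemaStore store = new SchemaStore();
        SketchSchema schema = emptySchema(store);
        schema = schema.addField("title", false);
        schema = schema.addField("body", false);
        schema = schema.addCopyFields("title", Arrays.asList("title_str"));
        schema.persist();
        return store.writes;
    }

    /** Adds the same two fields but persists after each one. */
    public static int writesWhenPersistingEachField() {
        SchemaStore store = new SchemaStore();
        SketchSchema schema = emptySchema(store);
        schema = schema.addField("title", true);
        schema = schema.addField("body", true);
        return store.writes;
    }

    public static void main(String[] args) {
        System.out.println("persist once: " + writesWhenPersistingOnce() + " write(s)");
        System.out.println("persist per field: " + writesWhenPersistingEachField() + " write(s)");
    }
}
```

In this sketch the batch ends with a single explicit {{persist()}} call, which is only possible because that method is callable from outside the schema class.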
SOLR-7682 added the ability to specify maxChars for copyField directives, and I intentionally left off the {{persist}} option of the new {{addCopyFields()}} method, because there was (intentionally) no way to invoke this capability via the (now deprecated) schema REST API, and the bulk schema API didn't need the {{persist}} option.

Long story short: I think making {{persistManagedSchema()}} public is a natural consequence of the bulk schema API (and in support of bulk operations from other sources, e.g. this issue). It's just that nobody had gotten around to it yet.

In the {{AddSchemaFieldsUpdateProcessorFactory.processAdd()}} in my patch I removed the {{instanceof ManagedIndexSchema}} check wrapping the call to {{persistManagedSchema()}}, as well as the {{NOCOMMIT}}s, since the check {{if ( ! cmd.getReq().getSchema().isMutable())}} at the beginning of the method already ensures that we're dealing with a {{ManagedIndexSchema}}.

I also removed the following {{typeMapping}} that was added in your patch from URP chains {{add-fields-no-run-processor}} and {{parse-and-add-fields}} in {{solrconfig-add-schema-fields-update-processor-chains.xml}} - I'm assuming this is a vestige from an earlier concept of removing {{<defaultTypeMapping>}}, since these chains have {{<str name="defaultFieldType">text</str>}}? {{AddSchemaFieldsUpdateProcessorFactoryTest}} passes with my change:

{code:xml}
<lst name="typeMapping">
  <str name="valueClass">java.lang.String</str>
  <str name="fieldType">text</str>
</lst>
{code}

> data_driven configs defaults to "strings" for unmapped fields, makes most fields containing "textual content" unsearchable, breaks tutorial examples
> ----------------------------------------------------------------------
>
>                 Key: SOLR-9526
>                 URL: https://issues.apache.org/jira/browse/SOLR-9526
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: UpdateRequestProcessors
>            Reporter: Hoss Man
>            Assignee: Jan Høydahl
>              Labels: dynamic-schema
>             Fix For: 7.0
>
>         Attachments: SOLR-9526.patch, SOLR-9526.patch, SOLR-9526.patch, SOLR-9526.patch, SOLR-9526.patch
>
>
> James Pritchett pointed out on the solr-user list that this sample query from the quick start tutorial matched no docs (even though the tutorial text says "The above request returns only one document")...
> http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=name:foundation
> The root problem seems to be that the add-unknown-fields-to-the-schema chain in data_driven_schema_configs is configured with...
> {code}
> <str name="defaultFieldType">strings</str>
> {code}
> ...and the "strings" type uses StrField and is not tokenized.
> ----
> Original thread: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201609.mbox/%3ccac-n2zrpsspfnk43agecspchc5b-0ff25xlfnzogyuvyg2d...@mail.gmail.com%3E