[jira] [Updated] (SOLR-8280) TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory
[ https://issues.apache.org/jira/browse/SOLR-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-8280:
---
    Attachment: SOLR-8280.patch

New in this patch...

* cleanup & beef up nocommit comments to point to new SOLR-8311 tracking jira
* beefed up ChangedSchemaMergeTest to actually change the sim used in each schema & verify it's updated (and fully functional)
* put some sanity checks in TestBulkSchemaAPI.testMultipleCommands
** already had some basic verification that adding a fieldtype w/sim + field using that type worked
** now it whitebox verifies that the underlying SimilarityFactory for the latest schema is and returns the expected Sim for each field.

...still testing, but i think this is good to go.

> TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use
> SolrCoreAware sim factory: SchemaSimilarityFactory
> ---
>
> Key: SOLR-8280
> URL: https://issues.apache.org/jira/browse/SOLR-8280
> Project: Solr
> Issue Type: Bug
> Affects Versions: Trunk
> Reporter: Hoss Man
> Attachments: SOLR-8280.patch, SOLR-8280.patch, SOLR-8280.patch,
> SOLR-8280.patch, SOLR-8280__broken__resource_loader_experiment.patch
>
>
> Something about the code path(s) involved in TestCloudSchemaless &
> ChangedSchemaMergeTest don't play nicely with a SimilarityFactory that is
> SolrCoreAware -- notably: SchemaSimilarityFactory.
> I discovered this while trying to implement SOLR-8271, but it can be
> reproduced trivially by modifying the
> schema-add-schema-fields-update-processor.xml file used by
> TestCloudSchemaless (and hardcoded in java schema used by
> ChangedSchemaMergeTest) to refer to SchemaSimilarityFactory explicitly.
> Other cloud tests (such as CollectionReloadTest) or cloud+schemaless (ex:
> TestCloudManagedSchema) tests don't seem to demonstrate the same problem.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8280) TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory
[ https://issues.apache.org/jira/browse/SOLR-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-8280:
---
    Attachment: SOLR-8280.patch

bq. I have a half-implemented patch hanging around somewhere that tried to clean this up a bit. I think the root problem is that there are two circumstances in which we're using SolrResourceLoader, ...

Agreed. It's a mess, and it would be nice to clean up -- but that's a huge pile of work, so i'd prefer to punt it to another issue. (which i will file soon)

After digging into things a bit more, here are a few things i learned/realized/uncovered in no particular order...

* in most cases, the schemaless/managed-schema code paths don't actually replace the entire IndexSchema used by a SolrCore...
** REST APIs use things like {{ManagedIndexSchema.addFieldTypes(...)}} which do shallow copies
*** these ManagedIndexSchema methods for mutating things and doing shallow copies already have smarts to ensure that any new {{ResourceAware}} objects get properly informed.
*** since there is no way to dynamically change the SimFactory at run time, the existing instance is re-used in all of these shallow copies and no new SimFactory instances ever need to be informed of the core
** some (cloud specific) code paths use things like {{ZkIndexSchemaReader.updateSchema}} to notice if/when the schema file changes in ZK and act on that locally
*** this does (evidently) construct an entirely new ManagedIndexSchema instance
*** this is the code path that was executing after {{TestCloudSchemaless}} was finished -- but I still don't understand when/why this was happening.
** ChangedSchemaMergeTest is kind of a special case, because it goes out of its way to construct a new IndexSchema and set it on an existing SolrCore even though it isn't using ManagedIndexSchema
* There was already a special kludge for SolrCoreAware SimFactories in {{SolrCore.initSchema}}
** looks like this was originally for ensuring that the SimFactory was usable when other SolrCoreAware things (like listeners) got informed of the SolrCore and tried to use the SolrIndexSearcher (which depended on the sim)

So i think the most straightforward solution to the problem (SimilarityFactory-ies that implement SolrCoreAware playing nice with managed schema) is to refactor that existing kludge from {{SolrCore.initSchema}} to {{SolrCore.setLatestSchema}}

Current Patch...

* schema-add-schema-fields-update-processor.xml - explicitly use SchemaSimilarityFactory here to help stress TestCloudSchemaless
* ChangedSchemaMergeTest - explicitly use SchemaSimilarityFactory here to test that scenario
* SolrCore - refactored existing SolrCoreAware simfactory hack so that it applies anytime setLatestSchema is called
* SchemaSimilarityFactory - switched from assertions to IllegalStateException so it's more obvious there's a problem even if assertions are disabled (no NPE)
* SolrResourceLoader - has some nocommits i want to update with strong warnings and a link to a new jira where my & alan's comments about the lifecycle problems of objects inited _after_ the SolrCore is loaded are tracked

TODO:

* cleanup & beef up nocommit comments
* beef up ChangedSchemaMergeTest to actually change the sim used in each schema & verify it's updated (and fully functional)
* add/update a managed schema test that does an add-field type w/ a per-fieldtype sim and sanity check that code path + input works properly and plays nicely with SchemaSimilarityFactory
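The refactoring described in this comment (moving the SolrCoreAware sim factory handling so that it runs on every schema swap, and failing with IllegalStateException instead of an assertion) can be sketched roughly as below. This is a simplified model under stated assumptions, not the actual patch: `Core`, `Schema`, `SimFactory`, and `CoreAware` are hypothetical stand-ins for SolrCore, IndexSchema, a SimilarityFactory, and SolrCoreAware, and the method bodies are invented for illustration.

```java
/** Hypothetical, simplified model of the approach; these are NOT the real Solr classes. */
final class SetLatestSchemaSketch {

    /** Stand-in for the SolrCoreAware contract. */
    interface CoreAware {
        void inform(Core core);
    }

    /** Stand-in for a core-aware similarity factory. */
    static final class SimFactory implements CoreAware {
        private Core core; // null until inform() is called

        @Override
        public void inform(Core core) {
            this.core = core;
        }

        String getSimilarity() {
            // An explicit exception still fires when JVM assertions are disabled.
            if (core == null) {
                throw new IllegalStateException("SimFactory used before inform(core)");
            }
            return "sim-for-" + core.name;
        }
    }

    /** Stand-in for a schema that owns a similarity factory. */
    static final class Schema {
        final Object similarityFactory;

        Schema(Object similarityFactory) {
            this.similarityFactory = similarityFactory;
        }
    }

    /** Stand-in for the core: informs the factory on every schema swap. */
    static final class Core {
        final String name;
        private Schema schema;

        Core(String name, Schema initial) {
            this.name = name;
            setLatestSchema(initial);
        }

        void setLatestSchema(Schema replacement) {
            // The special case lives here, so it covers the initial schema
            // and every schema installed later.
            if (replacement.similarityFactory instanceof CoreAware) {
                ((CoreAware) replacement.similarityFactory).inform(this);
            }
            this.schema = replacement;
        }

        Schema getLatestSchema() {
            return schema;
        }
    }

    public static void main(String[] args) {
        Core core = new Core("core1", new Schema(new SimFactory()));
        SimFactory replacementFactory = new SimFactory();
        core.setLatestSchema(new Schema(replacementFactory));
        System.out.println(replacementFactory.getSimilarity());
    }
}
```

Keeping the check in one place means a schema built after startup gets the same treatment as the first one, which is the scenario the two failing tests exercise.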
[jira] [Updated] (SOLR-8280) TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory
[ https://issues.apache.org/jira/browse/SOLR-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-8280:
---
    Attachment: SOLR-8280__broken__resource_loader_experiment.patch

The root problem seems to be that when using the SolrResourceLoader to create new instances of objects, the loader is tracking what things are SolrCoreAware, ResourceLoaderAware, and/or SolrInfoMBean. Then, just before the SolrCore finishes initializing itself, it calls a method on SolrResourceLoader to take appropriate action to inform those instances (and/or add them to the MBean registry).

The problem happens when any new instances are created by the SolrResourceLoader _after_ the SolrCore is up and running -- it currently has a {{live}} boolean it uses to just flat out ignore whether or not these instances are SolrCoreAware, ResourceLoaderAware, and/or SolrInfoMBean, meaning that nothing in the call stack ever informs them about the SolrCore.

It looks like SOLR-4658 included a bit of a hack workaround for the ResourceLoaderAware schema elements (see IndexSchema's constructor which calls {{loader.inform(loader);}})...

http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/IndexSchema.java?r1=1463182&r2=1463181&pathrev=1463182

...this seems really sketchy because it causes *any* ResourceLoaderAware plugins inited so far by the core to be {{inform(ResourceLoader)}}ed once the first IndexSchema is created -- even though that's not supposed to happen until much later in the SolrCore constructor just before the CountDownLatch is released. What it does do however is ensure that when a new schema gets loaded later (by the REST API, or a schemaless update processor) any ResourceLoaderAware fieldtypes/analyzers are good to go -- but that doesn't do anything to help SolrCoreAware plugins like SimilarityFactory.
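The effect of the {{live}} flag described above can be sketched with a small model. This is only an illustration under assumptions: `Loader`, `Plugin`, and `CoreAware` are hypothetical stand-ins rather than the real SolrResourceLoader or SolrCoreAware types, and the core is reduced to a name string.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the tracking behavior; NOT the real Solr classes. */
final class LiveFlagSketch {

    /** Stand-in for a core-aware plugin contract. */
    interface CoreAware {
        void inform(String coreName);
    }

    /** Stand-in for a plugin such as a similarity factory. */
    static final class Plugin implements CoreAware {
        String coreName; // stays null until inform() is called

        @Override
        public void inform(String coreName) {
            this.coreName = coreName;
        }
    }

    /** Stand-in for the loader: it tracks instances only while the core is starting. */
    static final class Loader {
        private final List<CoreAware> waitingForCore = new ArrayList<>();
        private boolean live = false;

        Plugin newInstance() {
            Plugin plugin = new Plugin();
            if (!live) {
                waitingForCore.add(plugin); // remembered so it can be informed later
            }
            // Once live, the new instance is not tracked at all.
            return plugin;
        }

        /** Called once, just before the core finishes initializing. */
        void inform(String coreName) {
            for (CoreAware aware : waitingForCore) {
                aware.inform(coreName);
            }
            waitingForCore.clear();
            live = true;
        }
    }

    public static void main(String[] args) {
        Loader loader = new Loader();
        Plugin early = loader.newInstance(); // created during startup
        loader.inform("core1");
        Plugin late = loader.newInstance();  // created after the core is live
        System.out.println("early informed: " + (early.coreName != null));
        System.out.println("late informed:  " + (late.coreName != null));
    }
}
```

In this model the instance created after `inform` never learns about the core, which mirrors the situation of a SolrCoreAware factory built for a schema that is loaded after startup.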

I'm attaching a work in progress patch where I attempted to fix the underlying problem with SolrResourceLoader by having it keep a reference to the SolrCore it's tied to such that any new instances created after that would be immediately informed of the SolrCore/ResourceLoader. This fixes some of the tests I mentioned before in this issue that have problems with SchemaSimilarityFactory but causes other failures in other existing tests that reload the schema -- because any FieldType that is ResourceLoaderAware is now being "informed" of the loader as soon as it's instantiated -- before even basic init() methods are called.

Which makes sense in hindsight -- my whole approach here is flawed because the contract is supposed to be that the init methods will always be called first, and any (valid) inform methods will be called at some point after that once the core/loader is available, but before the instance is used ... calling "new" then "inform" then "init" is madness.

I honestly don't know if there is a sane way to solve this problem in the general case -- the best thing I can think of at the moment is a similar special case hack for calling {{loader.inform(SolrCore)}} after any code that creates a schema (other than SolrCore)
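The ordering contract discussed in this comment (init first, inform later, use last) can be illustrated with a small sketch. It assumes a hypothetical `FieldTypeLike` class whose `inform` step depends on state set by `init`; the class, its argument map, and the "words" key are invented for illustration and are not the real Solr FieldType API.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical, simplified model of the lifecycle contract; NOT the real Solr classes. */
final class LifecycleOrderSketch {

    /** Stand-in for a plugin that is both initialized and loader-aware. */
    static final class FieldTypeLike {
        private Map<String, String> args;  // populated by init()
        private String resolvedResource;   // populated by inform()

        void init(Map<String, String> args) {
            this.args = args;
        }

        void inform(String loaderName) {
            // inform() needs the configuration that init() provides.
            if (args == null) {
                throw new IllegalStateException("inform() called before init()");
            }
            resolvedResource = loaderName + ":" + args.get("words");
        }

        String resolvedResource() {
            return resolvedResource;
        }
    }

    /** The order the contract promises: new, then init, then inform. */
    static String contractOrder() {
        FieldTypeLike fieldType = new FieldTypeLike();
        fieldType.init(Collections.singletonMap("words", "stopwords.txt"));
        fieldType.inform("loader");
        return fieldType.resolvedResource();
    }

    /** The order the experiment produced: new, then inform, before any init. */
    static boolean informBeforeInitFails() {
        FieldTypeLike fieldType = new FieldTypeLike();
        try {
            fieldType.inform("loader");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(contractOrder());
        System.out.println(informBeforeInitFails());
    }
}
```

A loader that informs an instance at construction time cannot satisfy this contract, because the caller has not yet had a chance to call `init`.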
[jira] [Updated] (SOLR-8280) TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory
[ https://issues.apache.org/jira/browse/SOLR-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-8280:
---
    Description:
Something about the code path(s) involved in TestCloudSchemaless & ChangedSchemaMergeTest don't play nicely with a SimilarityFactory that is SolrCoreAware -- notably: SchemaSimilarityFactory. I discovered this while trying to implement SOLR-8271, but it can be reproduced trivially by modifying the schema-add-schema-fields-update-processor.xml file used by TestCloudSchemaless (and hardcoded in java schema used by ChangedSchemaMergeTest) to refer to SchemaSimilarityFactory explicitly. Other cloud tests (such as CollectionReloadTest) or cloud+schemaless (ex: TestCloudManagedSchema) tests don't seem to demonstrate the same problem.

  was:
Something about the code path(s) involved in TestCloudSchemaless doesn't play nicely with a SimilarityFactory that is SolrCoreAware -- notably: SchemaSimilarityFactory. I discovered this while trying to implement SOLR-8271, but it can be reproduced trivially by modifying the schema-add-schema-fields-update-processor.xml file used by TestCloudSchemaless to refer to SchemaSimilarityFactory explicitly. Other cloud tests (such as CollectionReloadTest) or cloud+schemaless (ex: TestCloudManagedSchema) tests don't seem to demonstrate the same problem.

    Summary: TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory (was: TestCloudSchemaless fails weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory)