[jira] [Updated] (SOLR-8280) TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory

2015-11-18 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-8280:
---
Attachment: SOLR-8280.patch


New in this patch...

* cleanup & beef up nocommit comments to point to new SOLR-8311 tracking jira
* beefed up ChangedSchemaMergeTest to actually change the sim used in each 
schema & verify it's updated (and fully functional)
* put some sanity checks in TestBulkSchemaAPI.testMultipleCommands
** already had some basic verification that adding a fieldtype w/sim + field 
using that type worked
** now it whitebox verifies that the underlying SimilarityFactory for the 
latest schema is  and returns the expected Sim for each field.


...still testing, but i think this is good to go.

> TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use 
> SolrCoreAware sim factory: SchemaSimilarityFactory 
> ---
>
> Key: SOLR-8280
> URL: https://issues.apache.org/jira/browse/SOLR-8280
> Project: Solr
> Issue Type: Bug
> Affects Versions: Trunk
> Reporter: Hoss Man
> Attachments: SOLR-8280.patch, SOLR-8280.patch, SOLR-8280.patch, 
> SOLR-8280.patch, SOLR-8280__broken__resource_loader_experiment.patch
>
>
> Something about the code path(s) involved in TestCloudSchemaless & 
> ChangedSchemaMergeTest don't play nicely with a SimilarityFactory that is 
> SolrCoreAware -- notably: SchemaSimilarityFactory.
> I discovered this while trying to implement SOLR-8271, but it can be 
> reproduced trivially by modifying the 
> schema-add-schema-fields-update-processor.xml file used by 
> TestCloudSchemaless (and hardcoded in java schema used by 
> ChangedSchemaMergeTest) to refer to SchemaSimilarityFactory explicitly.  
> Other cloud tests (such as CollectionReloadTest) or cloud+schemaless (ex: 
> TestCloudManagedSchema) tests don't seem to demonstrate the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8280) TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory

2015-11-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-8280:
---
Attachment: SOLR-8280.patch


bq. I have a half-implemented patch hanging around somewhere that tried to 
clean this up a bit. I think the root problem is that there are two 
circumstances in which we're using SolrResourceLoader, ...

Agreed.  It's a mess, and it would be nice to clean up -- but that's a huge 
pile of work, so i'd prefer to punt it to another issue. (which i will file 
soon)



After digging into things a bit more, here are a few things i 
learned/realized/uncovered in no particular order...

* in most cases, the schemaless/managed-schema code paths don't actually 
replace the entire IndexSchema used by a SolrCore...
** REST APIs use things like {{ManagedIndexSchema.addFieldTypes(...)}} which do 
shallow copies
*** these ManagedIndexSchema methods for mutating things and doing shallow 
copies already have smarts to ensure that any new {{ResourceAware}} objects get 
properly informed.
*** since there is no way to dynamically change the SimFactory at run time, the 
existing instance is re-used in all of these shallow copies and no new 
SimFactory instances ever need to be informed of the core
** some (cloud specific) code paths use things like 
{{ZkIndexSchemaReader.updateSchema}} to notice if/when the schema file changes 
in ZK and act on that locally
*** this does (evidently) construct an entirely new ManagedIndexSchema instance
*** this is the code path that was executing after {{TestCloudSchemaless}} was 
finished -- but I still don't understand when/why this was happening.
** ChangedSchemaMergeTest is kind of a special case, because it goes out of 
its way to construct a new IndexSchema and set it on an existing SolrCore even 
though it isn't using ManagedIndexSchema
* There was already a special kludge for SolrCoreAware SimFactories in 
{{SolrCore.initSchema}}
** looks like this was originally for ensuring that the SimFactory was usable 
when other SolrCoreAware things (like listeners) got informed of the SolrCore 
and tried to use the SolrIndexSearcher (which depended on the sim)
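
To illustrate the shallow-copy point in the list above, here is a minimal sketch. Every class name in it is a hypothetical stand-in, not the real ManagedIndexSchema or SimilarityFactory; it only shows why a shallow copy that re-uses the existing factory instance never produces a new object that needs to be informed of the core.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not the real Solr classes.
class SketchSimFactory { }

class SketchSchema {
  final SketchSimFactory simFactory;
  final List<String> fieldTypes;

  SketchSchema(SketchSimFactory simFactory, List<String> fieldTypes) {
    this.simFactory = simFactory;
    this.fieldTypes = Collections.unmodifiableList(new ArrayList<String>(fieldTypes));
  }

  // Shallow copy: a new schema object with a new field type list,
  // but the very same sim factory reference as the original.
  SketchSchema addFieldType(String name) {
    List<String> copy = new ArrayList<String>(fieldTypes);
    copy.add(name);
    return new SketchSchema(simFactory, copy);
  }
}

class ShallowCopySketch {
  public static void main(String[] args) {
    SketchSchema base = new SketchSchema(new SketchSimFactory(), new ArrayList<String>());
    SketchSchema next = base.addFieldType("text_custom_sim");
    System.out.println(next.simFactory == base.simFactory); // prints true
    System.out.println(base.fieldTypes.size() + " -> " + next.fieldTypes.size()); // prints 0 -> 1
  }
}
```

A code path that builds an entirely new schema (and therefore a new factory) gets no such guarantee, which is the case the rest of this comment is about.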

So i think the most straightforward solution to the problem 
(SimilarityFactory-ies that implement SolrCoreAware playing nice with managed 
schema) is to refactor that existing kludge from {{SolrCore.initSchema}} to 
{{SolrCore.setLatestSchema}}
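
A rough sketch of what moving the special case could look like, under stated assumptions: {{CoreAware}}, {{ToyCore}}, {{ToySchema}} and {{ToySimFactory}} are made-up names for illustration, and the real SolrCore code is certainly more involved. The point is only that the "inform the factory" step sits in the setter, so it runs for the initial schema and for every replacement.

```java
// Hypothetical stand-ins; none of these are the real Solr types.
interface CoreAware {
  void inform(ToyCore core);
}

class ToySimFactory implements CoreAware {
  private ToyCore core; // stays null until inform() is called

  @Override
  public void inform(ToyCore core) {
    this.core = core;
  }

  boolean isInformed() {
    return core != null;
  }
}

class ToySchema {
  final String name;
  final Object simFactory;

  ToySchema(String name, Object simFactory) {
    this.name = name;
    this.simFactory = simFactory;
  }
}

class ToyCore {
  private ToySchema schema;

  ToyCore(ToySchema initial) {
    setLatestSchema(initial); // the initial schema goes through the same path
  }

  // The special case lives in the setter, so any schema swapped in later
  // also gets its core-aware sim factory informed.
  void setLatestSchema(ToySchema replacement) {
    if (replacement.simFactory instanceof CoreAware) {
      ((CoreAware) replacement.simFactory).inform(this);
    }
    this.schema = replacement;
  }

  ToySchema getLatestSchema() {
    return schema;
  }
}

class SetLatestSchemaSketch {
  public static void main(String[] args) {
    ToyCore core = new ToyCore(new ToySchema("v1", new ToySimFactory()));
    ToySimFactory second = new ToySimFactory();
    core.setLatestSchema(new ToySchema("v2", second));
    System.out.println(second.isInformed()); // prints true
  }
}
```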



Current Patch...

* schema-add-schema-fields-update-processor.xml - explicitly use 
SchemaSimilarityFactory here to help stress TestCloudSchemaless
* ChangedSchemaMergeTest - explicitly use SchemaSimilarityFactory here to test 
that scenario
* SolrCore - refactored existing SolrCoreAware simfactory hack so that it 
applies anytime setLatestSchema is called
* SchemaSimilarityFactory - switched from assertions to IllegalStateException 
so it's more obvious there's a problem even if assertions are disabled (no NPE)
* SolrResourceLoader - has some nocommits i want to update with strong warnings 
and a link to a new jira where my & alan's comments about the lifecycle 
problems of objects inited _after_ the SolrCore is loaded are tracked
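
For the "assertions to IllegalStateException" item above, a minimal sketch of the difference. {{ToyCoreAwareFactory}} is a hypothetical stand-in, not the real SchemaSimilarityFactory, and the message text is invented; the explicit check fires whether or not the JVM was started with assertions enabled.

```java
// Hypothetical stand-in, not the real SchemaSimilarityFactory.
class ToyCoreAwareFactory {
  private Object core; // set by inform(); null means "never informed"

  void inform(Object core) {
    this.core = core;
  }

  // An `assert core != null` here would be skipped when the JVM runs without -ea,
  // so the problem would only surface later, and less clearly.
  String getSimilarity() {
    if (core == null) {
      throw new IllegalStateException("inform(core) must be called before getSimilarity()");
    }
    return "similarity for " + core;
  }
}

class FailFastSketch {
  public static void main(String[] args) {
    ToyCoreAwareFactory factory = new ToyCoreAwareFactory();
    try {
      factory.getSimilarity();
    } catch (IllegalStateException e) {
      System.out.println("caught: " + e.getMessage());
    }
    factory.inform("core1");
    System.out.println(factory.getSimilarity()); // prints similarity for core1
  }
}
```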



TODO:
* cleanup & beef up nocommit comments
* beef up ChangedSchemaMergeTest to actually change the sim used in each schema 
& verify it's updated (and fully functional) 
* add/update a managed schema test that does an add-field type w/ a 
per-fieldtype sim and sanity check that code path + input works properly and 
plays nicely with SchemaSimilarityFactory








[jira] [Updated] (SOLR-8280) TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory

2015-11-15 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-8280:
---
Attachment: SOLR-8280__broken__resource_loader_experiment.patch


The root problem seems to be that when using the SolrResourceLoader to create 
newInstances of objects, the loader is tracking what things are SolrCoreAware, 
ResourceLoaderAware, and/or SolrInfoMBean.  Then, just before the SolrCore 
finishes initializing itself, it calls a method on SolrResourceLoader to take 
appropriate action to inform those instances (and/or add them to the MBean 
registry).

The problem happens when any new instances are created by the 
SolrResourceLoader _after_ the SolrCore is up and running -- it currently has a 
{{live}} boolean it uses to just flat out ignore whether or not these instances 
are SolrCoreAware, ResourceLoaderAware, and/or SolrInfoMBean, meaning that 
nothing in the call stack ever informs them about the SolrCore.
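
A toy model of the behavior described above, assuming a much simplified loader. {{LiveFlagLoader}}, {{LateAware}} and {{LatePlugin}} are hypothetical names and only the {{live}} flag idea is taken from the description; the real SolrResourceLoader tracks more kinds of objects than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-ins; not the real SolrResourceLoader or SolrCoreAware.
interface LateAware {
  void inform(String coreName);
}

class LatePlugin implements LateAware {
  String coreName; // null until informed

  @Override
  public void inform(String coreName) {
    this.coreName = coreName;
  }
}

class LiveFlagLoader {
  private final List<LateAware> waitingForCore = new ArrayList<>();
  private boolean live = false;

  <T> T newInstance(Supplier<T> ctor) {
    T obj = ctor.get();
    // Once live, awareness is ignored entirely: nothing is tracked any more.
    if (!live && obj instanceof LateAware) {
      waitingForCore.add((LateAware) obj);
    }
    return obj;
  }

  // Called once, just before the core finishes initializing.
  void inform(String coreName) {
    for (LateAware aware : waitingForCore) {
      aware.inform(coreName);
    }
    waitingForCore.clear();
    live = true;
  }
}

class LiveFlagSketch {
  public static void main(String[] args) {
    LiveFlagLoader loader = new LiveFlagLoader();
    LatePlugin early = loader.newInstance(LatePlugin::new);
    loader.inform("core1");
    LatePlugin late = loader.newInstance(LatePlugin::new);
    System.out.println(early.coreName); // prints core1
    System.out.println(late.coreName);  // prints null
  }
}
```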

It looks like SOLR-4658 included a bit of a hack workaround for the 
ResourceLoaderAware schema elements (see IndexSchema's constructor, which calls 
{{loader.inform(loader);}})...

http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/IndexSchema.java?r1=1463182&r2=1463181&pathrev=1463182

...this seems really sketchy because it causes *any* ResourceLoaderAware plugins 
inited so far by the core to be {{inform(ResourceLoader)}}ed once the first 
IndexSchema is created -- even though that's not supposed to happen until much 
later in the SolrCore constructor just before the CountDownLatch is released.

What it does do, however, is ensure that when a new schema gets loaded later (by 
the REST API, or a schemaless update processor) the ResourceLoaderAware 
fieldtypes/analyzers are good to go -- but that doesn't do anything to help 
SolrCoreAware plugins like SimilarityFactory.

I'm attaching a work in progress patch where I attempted to fix the underlying 
problem with SolrResourceLoader by having it keep a reference to the SolrCore 
it's tied to, such that any new instances created after that would be immediately 
informed of the SolrCore/ResourceLoader.  This fixes some of the tests I 
mentioned before in this issue that have problems with SchemaSimilarityFactory 
but causes other failures in other existing tests that reload the schema -- 
because any FieldType that is ResourceLoaderAware is now being "informed" of 
the loader as soon as it's instantiated -- before even basic init() methods are 
called.  Which makes sense in hindsight -- my whole approach here is flawed 
because the contract is supposed to be that the init methods will always be 
called first, and any (valid) inform methods will be called at some point after 
that once the core/loader is available, but before the instance is used ... 
calling "new" then "inform" then "init" is madness.
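
The ordering contract described above (init first, inform later, use last) can be sketched as follows. {{ToyFieldType}} and its {{words}} argument are hypothetical, invented only to show why informing an instance before its init method has run cannot work: the inform step depends on configuration that init is responsible for setting.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for a loader-aware schema element; not a real Solr class.
class ToyFieldType {
  private String resourceName; // set by init()
  private String loadedWords;  // set by inform()

  // Step 1: plain configuration, no loader or core needed.
  void init(Map<String, String> args) {
    resourceName = args.get("words");
  }

  // Step 2: needs what init() configured, so it must come second.
  void inform(Map<String, String> resources) {
    if (resourceName == null) {
      throw new IllegalStateException("inform() called before init()");
    }
    loadedWords = resources.get(resourceName);
  }

  String loadedWords() {
    return loadedWords;
  }
}

class InitThenInformSketch {
  public static void main(String[] args) {
    Map<String, String> resources = Collections.singletonMap("stopwords.txt", "a,an,the");

    ToyFieldType ok = new ToyFieldType();
    ok.init(Collections.singletonMap("words", "stopwords.txt"));
    ok.inform(resources);
    System.out.println(ok.loadedWords()); // prints a,an,the

    ToyFieldType broken = new ToyFieldType();
    try {
      broken.inform(resources); // "new" then "inform", with no init() yet
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage()); // prints inform() called before init()
    }
  }
}
```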

I honestly don't know if there is a sane way to solve this problem in the 
general case -- the best thing I can think of at the moment is a similar 
special case hack for calling {{loader.inform(SolrCore)}} after any code that 
creates a schema (other than SolrCore)







[jira] [Updated] (SOLR-8280) TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if you try to use SolrCoreAware sim factory: SchemaSimilarityFactory

2015-11-11 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-8280:
---
Description: 
Something about the code path(s) involved in TestCloudSchemaless & 
ChangedSchemaMergeTest don't play nicely with a SimilarityFactory that is 
SolrCoreAware -- notably: SchemaSimilarityFactory.

I discovered this while trying to implement SOLR-8271, but it can be reproduced 
trivially by modifying the schema-add-schema-fields-update-processor.xml file 
used by TestCloudSchemaless (and hardcoded in java schema used by 
ChangedSchemaMergeTest) to refer to SchemaSimilarityFactory explicitly.  Other 
cloud tests (such as CollectionReloadTest) or cloud+schemaless (ex: 
TestCloudManagedSchema) tests don't seem to demonstrate the same problem.


  was:
Something about the code path(s) involved in TestCloudSchemaless doesn't play 
nicely with a SimilarityFactory that is SolrCoreAware -- notably: 
SchemaSimilarityFactory.

I discovered this while trying to implement SOLR-8271, but it can be reproduced 
trivially by modifying the schema-add-schema-fields-update-processor.xml file 
used by TestCloudSchemaless to refer to SchemaSimilarityFactory explicitly.  
Other cloud tests (such as CollectionReloadTest) or cloud+schemaless (ex: 
TestCloudManagedSchema) tests don't seem to demonstrate the same problem.


Summary: TestCloudSchemaless + ChangedSchemaMergeTest fail weirdly if 
you try to use SolrCoreAware sim factory: SchemaSimilarityFactory   (was: 
TestCloudSchemaless fails weirdly if you try to use SolrCoreAware sim factory: 
SchemaSimilarityFactory )



