[jira] [Commented] (SOLR-10229) See what it would take to shift many of our one-off schemas used for testing to managed schema and construct them as part of the tests

Erick Erickson (JIRA) Sat, 01 Apr 2017 10:41:56 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952316#comment-15952316
 ]


Erick Erickson commented on SOLR-10229:
---------------------------------------

bq: If you look closely at the public methods exposed to be used, all are 
static and h.getCore each time will fetch the current test-suites core and its 
schema, which is correct, no

h.getCore() is overly restrictive: it doesn't support having more than one 
core open and modifying each core's schema. The problem is that it fetches 
_the_ test core. That's convenient for tests that only operate on a single 
core, but for more complex situations it's quite limiting.

Take a look at, for instance, TestLazyCores. It opens multiple cores, so it 
has to do some fancy dancing and bypass h.getCore() completely. Admittedly 
those cores all use the same schema, but that doesn't matter: if I wanted each 
of them to have new field definitions I couldn't use h.getCore(), even 
implicitly, even if all the new field definitions were the same.

bq: ...different cores with different schemas in the same test in our 
test-suites... Are there such use cases? 

Not that I know of offhand, but that doesn't mean much; there's a _lot_ of 
test code ;). It's unnecessarily restrictive to confine ourselves to that 
paradigm though. And as above, using h.getCore() doesn't allow modifying 
schemas for more than one core in any given test.
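
To make the point about more than one core concrete, here is a minimal sketch 
of a helper that takes the core as an explicit argument instead of looking up 
a single shared test core. Everything in it (MultiCoreSchemaSketch, ToyCore, 
addField) is a hypothetical toy model made up for illustration; none of it is 
Solr code or a Solr API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: these are NOT Solr classes or Solr APIs.
final class MultiCoreSchemaSketch {

    /** Hypothetical stand-in for a core that owns a mutable schema. */
    public static final class ToyCore {
        private final String name;
        // field name -> field type name, in insertion order
        private final Map<String, String> fields = new LinkedHashMap<>();

        public ToyCore(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

        /** Read-only view of the fields defined on this core. */
        public Map<String, String> getFields() {
            return Collections.unmodifiableMap(fields);
        }
    }

    /**
     * Hypothetical helper: the core is an explicit argument, so the caller
     * decides which core's schema is changed.
     */
    public static void addField(ToyCore core, String fieldName, String typeName) {
        if (core.fields.containsKey(fieldName)) {
            throw new IllegalArgumentException(
                "Field " + fieldName + " already exists in core " + core.name);
        }
        core.fields.put(fieldName, typeName);
    }

    private MultiCoreSchemaSketch() {
        // static helpers only
    }
}
```

With this shape, a single test could create two ToyCore instances and give 
each one different fields, which a helper bound to one implicit core cannot 
express.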

bq:  I will do repetitive forced testing for two or more test suites 
simultaneously and observe what's happening.

This isn't quite the issue. If we try to persist _anything_ to the "source 
tree", which includes all the config files in this case, the test framework 
should throw an exception. I'm not worried about multiple cores making 
modifications to the on-disk files; _no_ mods should be allowed unless the 
configs are in a temp dir. You'll see lots of code like (again from 
TestLazyCores since I know that code):

    solrHomeDirectory = createTempDir().toFile();
    File coreRoot = new File(solrHomeDirectory, coreName);
    copyMinConf(coreRoot, "name=" + coreName);

so a temp dir (which is automagically cleaned up by the test harness) is 
required to change anything on disk. Just using this new approach shouldn't 
require creating a temp dir and copying stuff to it.
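
The "no mods outside a temp dir" rule can be sketched as a simple path check. 
This is only an illustration of the idea, not how the test framework actually 
enforces it; ConfigWriteGuard and checkWritable are hypothetical names, and 
the only real APIs used are from java.nio.file.

```java
import java.nio.file.Path;

// Hypothetical illustration; not part of Solr or its test framework.
final class ConfigWriteGuard {
    private final Path tempRoot;

    /** tempRoot is the directory under which writes are permitted. */
    public ConfigWriteGuard(Path tempRoot) {
        // Normalize once so later comparisons are not fooled by "." or "..".
        this.tempRoot = tempRoot.toAbsolutePath().normalize();
    }

    /**
     * Returns the normalized target if it lies under the temp root,
     * otherwise throws so the write never reaches the source tree.
     */
    public Path checkWritable(Path target) {
        Path normalized = target.toAbsolutePath().normalize();
        if (!normalized.startsWith(tempRoot)) {
            throw new IllegalStateException(
                "Refusing to write outside temp dir: " + normalized);
        }
        return normalized;
    }
}
```

A test would call checkWritable before persisting a config file, so an attempt 
to modify anything in the source tree fails fast instead of silently changing 
a file other tests depend on. Note that this compares normalized paths only 
and does not resolve symbolic links.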

> See what it would take to shift many of our one-off schemas used for testing 
> to managed schema and construct them as part of the tests
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-10229
>                 URL: https://issues.apache.org/jira/browse/SOLR-10229
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: SOLR-10229.patch
>
>
> The test schema files are intimidating. There are about a zillion of them, 
> and making a change in any of them risks breaking some _other_ test. That 
> leaves people three choices:
> 1> add what they need to some existing schema. Which makes schemas bigger and 
> bigger and bigger.
> 2> create a new schema file, adding to the proliferation thereof.
> 3> Look through all the existing tests to see if they have something that 
> works.
> The recent work on LUCENE-7705 is a case in point. We're adding a maxLen 
> parameter to some tokenizers. Putting those parameters into any of the 
> existing schemas, especially to test < 255 char tokens, is virtually 
> guaranteed to break other tests, so the only safe thing to do is make another 
> schema file, adding to the multiplication of files.
> As part of SOLR-5260 I tried creating the schema on the fly rather than 
> creating a new static schema file and it's not hard. WDYT about making this 
> into some better thought-out utility? 
> At present, this is pretty fuzzy, I wanted to get some reactions before 
> putting much effort into it. I expect that the utility methods would 
> eventually get a bunch of canned types. It's reasonably straightforward for 
> primitive types, if lengthy. But when you get into solr.TextField-based types 
> it gets less straightforward.
> We could manage to just move the "intimidation" from the plethora of schema 
> files to a zillion fieldTypes in the utility to choose from...
> Also, forcing every test to define the fields up-front is arguably less 
> convenient than just having _some_ canned schemas we can use. And erroneous 
> schemas to test failure modes are probably not very good fits for any such 
> framework.
> [~steve_rowe] and [~hossman_luc...@fucit.org] in particular might have 
> something to say.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
