[
https://issues.apache.org/jira/browse/SOLR-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896786#comment-15896786
]
Erick Erickson commented on SOLR-10229:
---------------------------------------
First, the managed schema stuff doesn't require SolrCloud, right? In SOLR-5260,
the process method just takes a SolrClient even though it's passed a
CloudSolrClient. IIUC, anyway.
Re: SOLR-10117: frankly, I find that pretty hard to follow. For any test that
had a bunch of fields to be added, the setup code would get humongous. I was
hoping to hide all that away in a utility class and have users be able to do
something in @BeforeClass like
load minimal managed schema (this is just fields)
addToSchema(canned_type1, name1, prop1, val1, prop2, val2.....)
addToSchema(canned_type2, name2, prop11, val11, prop22, val22.....)
.
.
.
updateSchema(SolrClient...)
Note that prop1, val1 and the like override the defaults for canned_type1, etc.
That _ought_ to work for both cloud and stand-alone...
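A rough sketch of the "canned defaults plus overrides" idea above, in plain
Java so it runs without a Solr instance. Everything named here
(TestSchemaBuilder, pendingFields, the canned types and their default values)
is a hypothetical illustration, not an existing Solr test utility. The
updateSchema(SolrClient) step is left out; the queued maps are only the kind of
attribute maps that such a step might send through the Schema API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: class name, method names, and canned defaults are
// illustrative assumptions, not part of Solr's test framework.
final class TestSchemaBuilder {

  // Canned defaults keyed by field type name (values made up for this sketch).
  private static final Map<String, Map<String, Object>> CANNED = new LinkedHashMap<>();
  static {
    CANNED.put("string", attrs("indexed", true, "stored", true, "multiValued", false));
    CANNED.put("text_general", attrs("indexed", true, "stored", false, "multiValued", false));
  }

  // Field definitions queued so far, in the order they were added.
  private final List<Map<String, Object>> pending = new ArrayList<>();

  // Turns alternating name/value arguments into an ordered map.
  private static Map<String, Object> attrs(Object... keyValues) {
    if (keyValues.length % 2 != 0) {
      throw new IllegalArgumentException("properties must come in name/value pairs");
    }
    Map<String, Object> map = new LinkedHashMap<>();
    for (int i = 0; i < keyValues.length; i += 2) {
      map.put(String.valueOf(keyValues[i]), keyValues[i + 1]);
    }
    return map;
  }

  // Queues one field: canned defaults first, then the caller's overrides on top.
  TestSchemaBuilder addToSchema(String cannedType, String name, Object... overrides) {
    Map<String, Object> defaults = CANNED.get(cannedType);
    if (defaults == null) {
      throw new IllegalArgumentException("unknown canned type: " + cannedType);
    }
    Map<String, Object> field = new LinkedHashMap<>();
    field.put("name", name);
    field.put("type", cannedType);
    field.putAll(defaults);
    field.putAll(attrs(overrides));
    pending.add(field);
    return this;
  }

  // Read-only view of the queued definitions.
  List<Map<String, Object>> pendingFields() {
    return Collections.unmodifiableList(pending);
  }
}
```

A test would then chain calls such as
new TestSchemaBuilder().addToSchema("string", "id").addToSchema("text_general", "body", "stored", true)
and hand the result to whatever does the actual schema update.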
The problem I ran into when I was faking this earlier today is that fieldTypes
aren't simple. While I haven't put much thought into it yet, the simple thing
we're trying to do for LUCENE-7705 is a case in point. I want a fieldType with
a new parameter set on several tokenizers. That, of course, is solved by the
technique in SOLR-10117, but forcing every test to define its own fieldTypes
when the vast majority are common seems like a bad tradeoff. OTOH, pre-defining
a boatload of fieldTypes in some utility makes that utility at least as hard to
follow as a huge schema file.
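To illustrate why fieldTypes don't fit a flat name/value list, here is a small
sketch of one canned text type as a nested map, plus a helper that sets a
single tokenizer parameter on it. The names CannedFieldTypes, textType and
withTokenizerParam are hypothetical, and "maxLen" is just the parameter name
used in the issue description; the name in the committed code may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; nothing here is an existing Solr API.
final class CannedFieldTypes {

  // Builds a fresh nested definition: fieldType -> analyzer -> tokenizer.
  static Map<String, Object> textType(String name) {
    Map<String, Object> tokenizer = new LinkedHashMap<>();
    tokenizer.put("class", "solr.WhitespaceTokenizerFactory");

    Map<String, Object> analyzer = new LinkedHashMap<>();
    analyzer.put("tokenizer", tokenizer);

    Map<String, Object> type = new LinkedHashMap<>();
    type.put("name", name);
    type.put("class", "solr.TextField");
    type.put("analyzer", analyzer);
    return type;
  }

  // Sets one parameter on the tokenizer of a type built by textType().
  // The override has to reach two levels down, which a flat
  // (prop1, val1, prop2, val2, ...) argument list cannot express directly.
  @SuppressWarnings("unchecked")
  static Map<String, Object> withTokenizerParam(
      Map<String, Object> type, String param, Object value) {
    Map<String, Object> analyzer = (Map<String, Object>) type.get("analyzer");
    Map<String, Object> tokenizer = (Map<String, Object>) analyzer.get("tokenizer");
    tokenizer.put(param, String.valueOf(value));
    return type;
  }
}
```

For example, withTokenizerParam(textType("text_short"), "maxLen", 10) yields a
type whose tokenizer map carries the extra parameter. Every additional canned
type or nested override grows the utility, which is the tradeoff described
above.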
So I'm a little stuck on how to avoid making tests really painful to write
while still reducing the need to create yet another schema file. Maybe there's
no good compromise...
> See what it would take to shift many of our one-off schemas used for testing
> to managed schema and construct them as part of the tests
> --------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-10229
> URL: https://issues.apache.org/jira/browse/SOLR-10229
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Priority: Minor
>
> The test schema files are intimidating. There are about a zillion of them,
> and making a change in any of them risks breaking some _other_ test. That
> leaves people three choices:
> 1> add what they need to some existing schema. Which makes schemas bigger and
> bigger and bigger.
> 2> create a new schema file, adding to the proliferation thereof.
> 3> Look through all the existing tests to see if they have something that
> works.
> The recent work on LUCENE-7705 is a case in point. We're adding a maxLen
> parameter to some tokenizers. Putting those parameters into any of the
> existing schemas, especially to test < 255 char tokens, is virtually
> guaranteed to break other tests, so the only safe thing to do is make another
> schema file, adding to the multiplication of files.
> As part of SOLR-5260 I tried creating the schema on the fly rather than
> creating a new static schema file and it's not hard. WDYT about making this
> into some better thought-out utility?
> At present, this is pretty fuzzy, I wanted to get some reactions before
> putting much effort into it. I expect that the utility methods would
> eventually get a bunch of canned types. It's reasonably straightforward for
> primitive types, if lengthy. But when you get into solr.TextField-based types
> it gets less straightforward.
> We could manage to just move the "intimidation" from the plethora of schema
> files to a zillion fieldTypes in the utility to choose from...
> Also, forcing every test to define the fields up-front is arguably less
> convenient than just having _some_ canned schemas we can use. And erroneous
> schemas to test failure modes are probably not very good fits for any such
> framework.
> [~steve_rowe] and [[email protected]] in particular might have
> something to say.