[jira] [Commented] (SOLR-10229) See what it would take to shift many of our one-off schemas used for testing to managed schema and construct them as part of the tests

Amrit Sarkar (JIRA) Thu, 13 Apr 2017 09:20:06 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967813#comment-15967813
 ]


Amrit Sarkar commented on SOLR-10229:
-------------------------------------

Thank you Erick for the detailed feedback,

I have rectified the first three nits you mentioned and will upload new patch 
soon.

bq. nit4: Do the <copyField> directives in mother-schema make sense?
I know and was skeptical adding the feature but there is no harm providing the 
feature considering it is not taking that much code space in the framework. I 
can remove this if it is insignificant.

bq. I'm a little uncomfortable with the build(h.getCore) method being called 
for every field type addition and every field addition. I'm not sure how much 
work that entails, will we be reloading the schema again and again? Perhaps Use 
account "steve_rowe" instead or Hoss Man can weigh in (or I'll look more in the 
morning). WDYT about this kind of pattern?

We are not reloading the schema, but adding a new field to the list in present 
schema already available to core. It is atomic and doesn't require reloading. 
core.setLatestSchema(newSchema) is setting at the core level before informing 
the core regarding the SimilarityFactory, which doesn't concern with what we 
are doing here. More insights on this will be better.

bq. Ditto for Fields.... As I said, I really don't know whether this'll be more 
efficient or not, think of this as a marker to make sure we answer before 
diving in totally.

This is good I suppose, passing SolrCore again and again doesn't seem really 
good way and wrapping it in a list and passing it once seems good. *Also 
unsetting and resetting of schema will be saved.* I will add that. Though I 
didn't modify the TestMaxTokenLenTokenizer at this moment with 
factory.addNewFields(...) and factory.addNewFieldTypes(...) and post that patch 
on LUCENE-7705 for your feedback and changes.

bq. WDYT about splitting the 7705 bits out (we'll have to commit this one 
before 7705) and adding this patch with other examples?

I added that to have your feedback on how it will show up in an actual test 
suite. I will remove it.

bq. Along the way, by replacing the uses of at least two uses of a schema we 
can also make certain some test harness trickery isn't causing us to reload the 
mother-schema for every test suite...

Unless and until, it forces TestSchemaFrameworkFactory object already created 
to get destroyed, we are good!

I will post respective patches on the two jiras shortly. 



> See what it would take to shift many of our one-off schemas used for testing 
> to managed schema and construct them as part of the tests
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-10229
>                 URL: https://issues.apache.org/jira/browse/SOLR-10229
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: SOLR-10229.patch, SOLR-10229.patch, SOLR-10229.patch, 
> SOLR-10229.patch, SOLR-10229.patch
>
>
> The test schema files are intimidating. There are about a zillion of them, 
> and making a change in any of them risks breaking some _other_ test. That 
> leaves people three choices:
> 1> add what they need to some existing schema. Which makes schemas bigger and 
> bigger and bigger.
> 2> create a new schema file, adding to the proliferation thereof.
> 3> Look through all the existing tests to see if they have something that 
> works.
> The recent work on LUCENE-7705 is a case in point. We're adding a maxLen 
> parameter to some tokenizers. Putting those parameters into any of the 
> existing schemas, especially to test < 255 char tokens is virtually 
> guaranteed to break other tests, so the only safe thing to do is make another 
> schema file. Adding to the multiplication of files.
> As part of SOLR-5260 I tried creating the schema on the fly rather than 
> creating a new static schema file and it's not hard. WDYT about making this 
> into some better thought-out utility? 
> At present, this is pretty fuzzy, I wanted to get some reactions before 
> putting much effort into it. I expect that the utility methods would 
> eventually get a bunch of canned types. It's reasonably straightforward for 
> primitive types, if lengthy. But when you get into solr.TextField-based types 
> it gets less straight-forward.
> We could manage to just move the "intimidation" from the plethora of schema 
> files to a zillion fieldTypes in the utility to choose from...
> Also, forcing every test to define the fields up-front is arguably less 
> convenient than just having _some_ canned schemas we can use. And erroneous 
> schemas to test failure modes are probably not very good fits for any such 
> framework.
> [~steve_rowe] and [~hossman_luc...@fucit.org] in particular might have 
> something to say.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-10229) See what it would take to shift many of our one-off schemas used for testing to managed schema and construct them as part of the tests

Reply via email to