[ https://issues.apache.org/jira/browse/SOLR-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amrit Sarkar updated SOLR-10229:
--------------------------------
    Comment: was deleted

(was: Updated patch:

Refined the builder methods for the framework. TestKeywordTokenizer.java for LUCENE-7705 now uses the framework, and I was able to implement it successfully using the declarative builder methods.

Please note one thing regarding building FieldTypes:

{code}
framework.createNewFieldType()
    .withName("keywordType")
    .withClassName("solr.TextField")
    .withAttribute("positionIncrementGap", "100")
    .withAttribute("analyzer",
        map("tokenizer",
            map("class", "solr.KeywordTokenizerFactory", "maxTokenLen", "3")))
    .build(h.getCore());
{code}

To define the analyzer, the attributes are declared as nested maps. I believe this is the correct way to do it, but I am seeking suggestions on whether we want to handle them better.

I am struggling with loading the mother schema in the framework. These are the challenges, and I am seeking advice on them:

1. I am trying to use *_ManagedSchemaFactory.create()_* to load the mother schema, but it needs a live SolrConfig object. If I pass a dummy SolrConfig and manage to create the schema, then when we run a test and load the core with an empty schema, the mother schema gets replaced by the empty one.
2. The access levels of the methods and constructors are restrictive.
3. I dug down to *readSchema(InputSource is)*, which effectively reads the schema and fills the fields, fieldTypes, copyFields .... lists into the core. If I use IndexSchema directly to get at *readSchema(InputSource is)*, it is immutable, and hence the functions related to the Schema API don't apply to it. Also, _readSchema_ needs a _SolrResourceLoader_ from _SolrConfig_, which should be a one-off thing at the time of framework creation.

In the patch I have commented out the loading of the mother schema while I try out different combinations and techniques to load it. I am sure there is a way, and I am seeking some pointers. I was also thinking about reading the schema with a plain XML reader, though I am not sure that is a good approach.)
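
The nested-map declaration in the snippet above can be illustrated in isolation. This is only a minimal sketch: the `map(...)` helper used by the patch is not shown in this message, so the varargs helper below, and the `FieldTypeSketch` class and `keywordFieldType()` method around it, are hypothetical stand-ins. The attribute names and values are the ones from the comment. The resulting structure has the same general shape as the nested JSON the Schema API accepts when adding a field type, which may be one reason nested maps feel natural here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldTypeSketch {

    /**
     * Hypothetical helper: builds an insertion-ordered map from alternating
     * key/value arguments. Values may themselves be maps, which gives nesting.
     */
    public static Map<String, Object> map(Object... keysAndValues) {
        if (keysAndValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected alternating key/value pairs");
        }
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < keysAndValues.length; i += 2) {
            result.put(String.valueOf(keysAndValues[i]), keysAndValues[i + 1]);
        }
        return result;
    }

    /**
     * Hypothetical: collects the attributes of the "keywordType" field type
     * from the comment into one nested map (field type -> analyzer -> tokenizer).
     */
    public static Map<String, Object> keywordFieldType() {
        return map(
            "name", "keywordType",
            "class", "solr.TextField",
            "positionIncrementGap", "100",
            "analyzer", map(
                "tokenizer", map(
                    "class", "solr.KeywordTokenizerFactory",
                    "maxTokenLen", "3")));
    }

    public static void main(String[] args) {
        // Prints the nested structure so the analyzer/tokenizer levels are visible.
        System.out.println(keywordFieldType());
    }
}
```

One possible alternative to raw nested maps would be a dedicated analyzer builder, which trades the flexibility of free-form attributes for compile-time checking of the structure.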

> See what it would take to shift many of our one-off schemas used for testing
> to managed schema and construct them as part of the tests
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-10229
>                 URL: https://issues.apache.org/jira/browse/SOLR-10229
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: SOLR-10229.patch, SOLR-10229.patch
>
>
> The test schema files are intimidating. There are about a zillion of them,
> and making a change in any of them risks breaking some _other_ test. That
> leaves people three choices:
> 1> add what they need to some existing schema. Which makes schemas bigger and
> bigger and bigger.
> 2> create a new schema file, adding to the proliferation thereof.
> 3> Look through all the existing tests to see if they have something that
> works.
> The recent work on LUCENE-7705 is a case in point. We're adding a maxLen
> parameter to some tokenizers. Putting those parameters into any of the
> existing schemas, especially to test < 255 char tokens is virtually
> guaranteed to break other tests, so the only safe thing to do is make another
> schema file. Adding to the multiplication of files.
> As part of SOLR-5260 I tried creating the schema on the fly rather than
> creating a new static schema file and it's not hard. WDYT about making this
> into some better thought-out utility?
> At present, this is pretty fuzzy, I wanted to get some reactions before
> putting much effort into it. I expect that the utility methods would
> eventually get a bunch of canned types. It's reasonably straightforward for
> primitive types, if lengthy. But when you get into solr.TextField-based types
> it gets less straight-forward.
> We could manage to just move the "intimidation" from the plethora of schema
> files to a zillion fieldTypes in the utility to choose from...
> Also, forcing every test to define the fields up-front is arguably less
> convenient than just having _some_ canned schemas we can use. And erroneous
> schemas to test failure modes are probably not very good fits for any such
> framework.
> [~steve_rowe] and [~hossman_luc...@fucit.org] in particular might have
> something to say.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)