I'm working on SOLR-3251 <https://issues.apache.org/jira/browse/SOLR-3251>, to 
dynamically add fields to the Solr schema.

I posted a rough outline of how I propose to do this: 
<https://issues.apache.org/jira/browse/SOLR-3251?focusedCommentId=13572875&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13572875>.
  

So far, I've finished the first item - schema information REST requests, along 
with Restlet integration (moved up from the last item in the outline) - in 
SOLR-4503 <https://issues.apache.org/jira/browse/SOLR-4503>.

There are two specific concerns that I'd like feedback on: 1) schema 
serialization format and 2) disabling/enabling runtime schema modifications via 
REST API calls.  (I'd also be happy to get feedback on other aspects of this 
feature!)


1) Item #2 on the outline ("Change Solr schema serialization from XML to JSON, 
and provide an XML->JSON conversion tool") seems like it might be 
controversial, in that using JSON as the serialization format implies that Solr 
owns the configuration, and that direct user modification would no longer be 
the standard way to change the schema.

For most users, if a change is to be made, the transition will be an issue.  I 
think a hard break is off the table: whatever else happens, Solr will need to 
continue to be able to parse schema.xml, at least for all of 4.X and maybe 5.X 
too.

Two possible approaches:

a. When schema.xml is present, schema.json (if any) will be ignored.  Users 
could in this way signal whether dynamic schema modification is enabled: the 
presence of schema.xml indicates that the dynamic schema modification feature 
will be disabled.

b. Alternatively, the reverse: when schema.json is present, schema.xml will be 
ignored.  The first time schema.xml is found but schema.json isn't, schema.xml 
is automatically converted to schema.json.

I like option a. better, since it provides a stable situation for users who 
don't want the new dynamic schema modification feature, and who want to 
continue to hand edit schema.xml.  Users who want the new feature would use a 
command-line tool to convert their schema.xml to schema.json, then remove 
schema.xml from conf/.
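
To make the two precedence rules concrete, here is a minimal sketch of how the loader could pick a file from conf/.  SchemaFileResolver, Policy, and Resolution are hypothetical names for illustration only, not existing Solr classes, and the actual XML->JSON conversion is left out:

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not an existing Solr class): chooses the schema file in conf/. */
final class SchemaFileResolver {

    /** Which file takes precedence when both are present. */
    enum Policy { XML_WINS /* option a */, JSON_WINS /* option b */ }

    /** The chosen file, and whether it is the hand-edited XML form. */
    static final class Resolution {
        final Path file;
        final boolean isXml;

        Resolution(Path file, boolean isXml) {
            this.file = file;
            this.isXml = isXml;
        }
    }

    static Resolution resolve(Path confDir, Policy policy) {
        Path xml = confDir.resolve("schema.xml");
        Path json = confDir.resolve("schema.json");
        boolean hasXml = Files.exists(xml);
        boolean hasJson = Files.exists(json);

        if (!hasXml && !hasJson) {
            throw new IllegalStateException("No schema.xml or schema.json in " + confDir);
        }
        if (policy == Policy.XML_WINS) {
            // Option a: schema.xml, when present, hides schema.json and
            // signals that dynamic schema modification is disabled.
            return hasXml ? new Resolution(xml, true) : new Resolution(json, false);
        }
        // Option b: schema.json, when present, hides schema.xml.  If only
        // schema.xml exists, the caller would convert it to schema.json first.
        return hasJson ? new Resolution(json, false) : new Resolution(xml, true);
    }
}
```

Under option a, a result with isXml == true would mean "dynamic modification disabled"; under option b, it would mean "convert schema.xml to schema.json before loading".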


2) Since the REST APIs to modify the schema will not be registerable 
RequestHandlers, they can't be turned off by simply leaving them out of 
solrconfig.xml, and there is no plan (yet) for another way to disable schema 
modification requests.  Three possibilities come to mind:

a. A configuration setting in solrconfig.xml, e.g. top-level 
<schema mutable="true/false"/> - a change to it would take effect only after 
restarting the node.

b. A REST API call that allows for runtime querying and setting of the mutable 
status: http://localhost:8983/solr/schema/status would return the current 
status, and adding the query "?mutable=true/false" would change it.

c. A combination of the above two: a configuration item in solrconfig.xml to 
enable the REST API, e.g. <schema enableMutable="true/false"/>, and then a REST 
API to query the current status and allow or disallow modifications at runtime: 
/solr/schema/status for the current mutable status, and with the query 
"?mutable=true/false" to change it.  The mutable status would always be false 
at startup, so the flow to make modifications would involve first making a REST 
PUT to "/solr/schema/status?mutable=true".

I like option c. the best, since it would address concerns of users who don't 
want the schema to be modifiable.
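
As a rough illustration of option c., the two levels of control could be modeled as below.  SchemaMutabilityGate and its method names are hypothetical, and wiring it to solrconfig.xml and to the REST resource is assumed rather than shown:

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of option c. (not an existing Solr class). */
final class SchemaMutabilityGate {

    // Would be read once from solrconfig.xml, e.g. <schema enableMutable="..."/>;
    // changing it requires a node restart.
    private final boolean enabledInConfig;

    // Runtime status toggled via the REST API; always false at startup.
    private final AtomicBoolean mutable = new AtomicBoolean(false);

    SchemaMutabilityGate(boolean enabledInConfig) {
        this.enabledInConfig = enabledInConfig;
    }

    /** Backs the status query: the current mutable status. */
    boolean isMutable() {
        return mutable.get();
    }

    /**
     * Backs "?mutable=true/false".  Returns false, leaving the status
     * unchanged, when the config setting does not enable the feature.
     */
    boolean setMutable(boolean requested) {
        if (!enabledInConfig) {
            return false;
        }
        mutable.set(requested);
        return true;
    }

    /** Every schema-modifying request would call this before doing any work. */
    void checkModificationAllowed() {
        if (!isMutable()) {
            throw new IllegalStateException("Schema is not currently mutable");
        }
    }
}
```

With enableMutable="false", setMutable(true) is refused and the schema stays read-only until the config is changed and the node restarted; with enableMutable="true", a client still has to switch the status on before its first modification.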


I look forward to hearing others' thoughts on these and any other issues 
related to dynamic schema modification.

Thanks,
Steve
