To revisit sarowe's comment about how/when to decide if we are using the
"config file" version of schema info (and the API is read only) vs
"internal managed state data" version of schema info (and the API is
read/write)...

On Wed, 6 Mar 2013, Steve Rowe wrote:

: Two possible approaches:
: 
: a. When schema.xml is present, ...
        ...
: b. Alternatively, the reverse: ...
        ...
: I like option a. better, since it provides a stable situation for users 
: who don't want the new dynamic schema modification feature, and who want 
: to continue to hand edit schema.xml.  Users who want the new feature 
: would use a command-line tool to convert their schema.xml to 
: schema.json, then remove schema.xml from conf/.


The more i think about it, the less I like either "a" or "b" because both 
are completely implicit.

I think practically speaking, from a support standpoint, we should require 
a more explicit configuration of what *type* of schema management 
should be used, and then have code that sanity checks that and warns/fails 
if the configuration setting doesn't match what is found in the ./conf 
dir.

The situation i worry about is when a novice solr user takes over 
maintenance of an existing setup that is using REST based schema management, 
and therefore has no schema.xml file.  The novice is reading 
docs/tutorials talking about how to achieve some goal, which make reference 
to "editing the schema.xml" or "adding XXX to the schema.xml" or even 
worse in the cases of some CMSs: "To upgrade to FooCMS vX.Y, replace your 
schema.xml with this file..." but they have no schema.xml, nor any clear and 
obvious indication, looking at what configs they do have, of *why* there is 
no schema.xml, so maybe they try to add one.

I think it would be better to add some new option in solrconfig.xml that 
requires the user to be explicit about what type of management they want 
to use, defaulting to schema.xml for back compat...

  <schema type="conf" 
          [maybe an optional file="path/to/schema.xml" ?] />

...vs...

  <schema type="managed" 
          [this is where the mutable="true|false" sarowe mentioned could live] 
/>

Then on core load:

1) if the configured schema type is "conf" but there is no schema.xml 
file, ERROR loudly and fail fast.

2) if we see that the configured schema type is "conf" but we detected 
the existence of "managed" internal schema info (schema.json, zk nodes, 
whatever) then we should WARN that the managed internal data is being 
ignored.

3) if the configured schema type is "managed" but there is no managed 
internal schema info (schema.json, zk nodes, whatever) then ERROR loudly 
and fail fast (or maybe we create an empty schema for them?)

4) if we see that the configured schema type is "managed" but we 
also detected the existence of a "schema.xml" config file, then we 
should WARN that the schema.xml is being ignored.

...although i could easily be convinced that all of those "WARN" 
situations should really be hard failures to reduce confusion -- depends 
on how easy we can make it to let users delete all internally managed 
schema info before switching to a type="conf" schema.xml approach.
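
To make the four core-load checks concrete, here is a rough sketch in Java. 
Everything in it is hypothetical and made up for illustration (the 
SchemaModeCheck class, the SchemaType enum, the "strict" flag); none of it is 
existing Solr code. It assumes the caller has already worked out whether a 
schema.xml and/or managed schema data was found, and for case 3 it picks 
"fail fast" rather than creating an empty schema.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the four core-load checks; these names are not Solr APIs. */
final class SchemaModeCheck {

    /** Mirrors the proposed <schema type="conf|managed"/> setting. */
    enum SchemaType { CONF, MANAGED }

    /** Thrown for the "ERROR loudly and fail fast" cases. */
    static final class SchemaConfigException extends RuntimeException {
        SchemaConfigException(String msg) { super(msg); }
    }

    /**
     * @param type            configured schema type (CONF would be the back-compat default)
     * @param schemaXmlExists whether a schema.xml config file was found
     * @param managedExists   whether managed schema info (schema.json, zk nodes, ...) was found
     * @param strict          if true, the WARN cases become hard failures
     * @return warnings to log; empty when the setting and the conf dir agree
     */
    static List<String> check(SchemaType type, boolean schemaXmlExists,
                              boolean managedExists, boolean strict) {
        List<String> warnings = new ArrayList<>();
        if (type == SchemaType.CONF) {
            if (!schemaXmlExists) {
                // case 1: nothing to load, so fail fast
                throw new SchemaConfigException(
                    "schema type is \"conf\" but no schema.xml was found");
            }
            if (managedExists) {
                // case 2: managed data exists but will not be used
                warnOrFail(warnings, strict,
                    "schema type is \"conf\"; managed schema data is being ignored");
            }
        } else {
            if (!managedExists) {
                // case 3: fail fast (the alternative would be creating an empty schema)
                throw new SchemaConfigException(
                    "schema type is \"managed\" but no managed schema data was found");
            }
            if (schemaXmlExists) {
                // case 4: a schema.xml exists but will not be used
                warnOrFail(warnings, strict,
                    "schema type is \"managed\"; schema.xml is being ignored");
            }
        }
        return warnings;
    }

    /** Records a warning, or throws instead when strict mode is on. */
    private static void warnOrFail(List<String> warnings, boolean strict, String msg) {
        if (strict) {
            throw new SchemaConfigException(msg);
        }
        warnings.add(msg);
    }
}
```

With strict=false the mismatch cases (2 and 4) come back as warnings for the 
caller to log; with strict=true they throw just like cases 1 and 3, which is 
the "hard failures" variant.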


-Hoss
