Re: Dynamic schema design: feedback requested

Mark Miller Wed, 06 Mar 2013 11:50:13 -0800

Hmm…I think I'm missing some pieces.

I agree with Erik that you should be able to load a schema from any object - a 
DB, a file in ZooKeeper, you name it. But by default, having that object be 
schema.xml seems nicest to me. That doesn't mean you have to use DOM or XML 
internally - just that you have a serializer/deserializer for it. If you wanted 
to do it from a database, that would just be another serialize/deserialize 
impl. Internally, it could all be JSON or Java objects, or whatever.
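
A minimal sketch of that serializer/deserializer split, in Python for brevity 
rather than Solr's Java. Every name here (FieldDef, Schema, XmlCodec, 
JsonCodec) is hypothetical and not a Solr class, and the XML layout is a 
heavily simplified stand-in for a real schema.xml:

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class FieldDef:
    name: str
    type: str
    indexed: bool = True
    stored: bool = True


@dataclass
class Schema:
    # Format-neutral in-memory model: nothing here knows about XML or JSON.
    name: str
    fields: List[FieldDef] = field(default_factory=list)


class XmlCodec:
    """Reads/writes the simplified XML form (assumed layout, not real schema.xml)."""

    def loads(self, text: str) -> Schema:
        root = ET.fromstring(text)
        fields = [
            FieldDef(
                name=el.get("name"),
                type=el.get("type"),
                indexed=el.get("indexed", "true") == "true",
                stored=el.get("stored", "true") == "true",
            )
            for el in root.iter("field")
        ]
        return Schema(name=root.get("name", ""), fields=fields)

    def dumps(self, schema: Schema) -> str:
        root = ET.Element("schema", name=schema.name)
        fields_el = ET.SubElement(root, "fields")
        for f in schema.fields:
            ET.SubElement(
                fields_el, "field",
                name=f.name, type=f.type,
                indexed=str(f.indexed).lower(),
                stored=str(f.stored).lower(),
            )
        return ET.tostring(root, encoding="unicode")


class JsonCodec:
    """Same model, different wire format; a database-backed codec would follow suit."""

    def loads(self, text: str) -> Schema:
        data = json.loads(text)
        return Schema(name=data["name"],
                      fields=[FieldDef(**f) for f in data["fields"]])

    def dumps(self, schema: Schema) -> str:
        return json.dumps(asdict(schema))


# Start from a hand-edited XML file, then hand the same model to another codec.
xml_text = (
    '<schema name="example"><fields>'
    '<field name="id" type="string"/>'
    '<field name="body" type="text" stored="false"/>'
    '</fields></schema>'
)
schema = XmlCodec().loads(xml_text)
as_json = JsonCodec().dumps(schema)
```

The point of the sketch is only that the in-memory model stays the same 
whichever codec produced it, so schema.xml can remain the default without 
tying the internals to XML.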


As far as a user editing the file AND REST API access, I think that seems fine. 
Yes, the user is in trouble if they break the file, but that is the risk they 
take if they want to manually edit it - it's no different than today when you 
edit the file and do a core reload or something. I think we can improve 
validation around that, but it doesn't seem like a show stopper to me.

At a minimum, I think the user should be able to start with a hand-modified 
file. Many people *heavily* modify the example schema to fit their use case. If 
you have to start doing that by making 50 REST API calls, that's pretty rough. 
Once you get your schema nice and happy, you might script out those REST calls, 
but initially, it's much faster/easier to whack the schema into place in a text 
editor IMO.

Like I said though, I may be missing something…

- Mark

On Mar 6, 2013, at 11:17 AM, Steve Rowe <sar...@gmail.com> wrote:

> In response to my thoughts about using DOM as an intermediate representation 
> for schema elements, for use in lazy re-loading on schema change, Erik 
> Hatcher argued against (solely) using XML for schema serialization 
> (<https://issues.apache.org/jira/browse/SOLR-3251?focusedCommentId=13571631&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13571631>):
> 
>       IMO - The XMLness of the current Solr schema needs to be isolated
>       to only one optional way of constructing an IndexSchema instance.
>       We want less XML rather than more. (for example, it should be
>       possible to have a relational database that contains a model of
>       a schema and load it that way)
> 
> I was hoping to avoid dealing with round-tripping XML comments (of which 
> there are many in schema.xml).  My thought was that an XML->JSON conversion 
> tool would insert "description" properties on the enclosing/adjacent object 
> when it encounters comments.  But I suppose the same process could be applied 
> to schema.xml: XML comments could be converted to <description> elements, and 
> then when serializing changes, any user-inserted comments would be stripped.
> 
> The other concern is about schema "ownership": dealing with schemas that mix 
> hand-editing with Solr modification/serialization would likely be harder than 
> supporting just one of them.  But I suppose there is already a set of 
> validity checks, so maybe this wouldn't be so bad? 
> 
> Steve
> 
> On Mar 6, 2013, at 1:35 PM, Mark Miller <markrmil...@gmail.com> wrote:
> 
>> bq. Change Solr schema serialization from XML to JSON, and provide an 
>> XML->JSON conversion tool.
>> 
>> What is the motivation for the change? I think if you are sitting down and 
>> looking to design a schema, working with the XML is fairly nice and fast. I 
>> picture that a lot of people would start by working with the XML file to get 
>> it how they want, and then perhaps do future changes with the rest API. When 
>> you are developing, starting with the rest API feels fairly cumbersome if 
>> you have to make a lot of changes/additions/removals.
>> 
>> So why not just keep the XML and add the rest API? Do we gain much by 
>> switching it to JSON? I like JSON when it comes to rest, but when I think 
>> about editing a large schema doc locally, XML seems much easier to deal with.
>> 
>> - Mark
>
