Re: Dynamic schema design: feedback requested

Jan Høydahl Wed, 06 Mar 2013 13:50:35 -0800

How will this all work with ZooKeeper and cloud?

Will the node that changed the schema push the serialized monolithic 
schema.xml / schema.json to ZK, which then triggers an update to the rest of 
the cluster?


I was kind of hoping that once we have introduced ZK into the mix as our 
centralized config server, we could start using it as such consistently. So 
instead of ZK storing a plain XML file, we would split up the schema into 
native ZK nodes:

configs
 +configA
   +--schema
      +--version: 1.5
      +--fieldTypes
      |  +---text_en "{tokenizer: "foo", filters: [{name: "foo", class: "solr.StrField"...}, {name: "bar"...}]}"
      |  +---text_no "{tokenizer: "foo", filters: [{name: "foo", class: "solr.StrField"...}, {name: "bar"...}]}"
      +--fields
         +---title "..."
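
To make the layout above concrete, here is a minimal sketch of how a schema could be flattened into one payload per node and read back. It is plain Python with no ZK client involved; the function names and the choice of JSON as the per-node payload are my assumptions for illustration, not an existing Solr or ZooKeeper API.

```python
import json

SECTIONS = ("fieldTypes", "fields")


def schema_to_znodes(config_name, schema):
    """Flatten a schema dict into {znode_path: bytes}, one entry per field or type.

    Hypothetical helper: the path layout mirrors the tree sketched above.
    """
    base = f"/configs/{config_name}/schema"
    nodes = {f"{base}/version": str(schema["version"]).encode("utf-8")}
    for section in SECTIONS:
        for name, definition in schema.get(section, {}).items():
            payload = json.dumps(definition, sort_keys=True)
            nodes[f"{base}/{section}/{name}"] = payload.encode("utf-8")
    return nodes


def znodes_to_schema(config_name, nodes):
    """Rebuild the schema dict from the per-node payloads."""
    base = f"/configs/{config_name}/schema"
    schema = {
        "version": nodes[f"{base}/version"].decode("utf-8"),
        "fieldTypes": {},
        "fields": {},
    }
    for path, data in nodes.items():
        if not path.startswith(base + "/"):
            continue  # belongs to another config set
        parts = path[len(base) + 1:].split("/")
        if len(parts) == 2 and parts[0] in SECTIONS:
            schema[parts[0]][parts[1]] = json.loads(data.decode("utf-8"))
    return schema
```

With something like this, updating a single field means rewriting one small node instead of the whole file.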

Then we or third parties can build various tools to interact with the schema. 
Your REST service would read and update these manageable chunks in ZK, and it 
will all stay in sync. It is also more 1:1 with how things are wired: multiple 
collections may share the same config set and thus the same schema. So what 
happens if someone does not know this, hits 
PUT localhost:8983/solr/collection1/schema, and also changes the schema for 
collection2? These relationships are already maintained in ZK.
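
As a rough illustration of the shared config set concern, the check below lists which other collections would be affected by a schema change made through one collection. The mapping from collection to config set name is assumed to have been read from ZK already; the function name and data shape are hypothetical.

```python
def collections_sharing_config(collection_to_config, collection):
    """Return the other collections that use the same config set.

    collection_to_config: dict of collection name -> config set name
    (assumed to be loaded from ZK by whatever means the service uses).
    """
    config = collection_to_config[collection]
    return sorted(
        name
        for name, cfg in collection_to_config.items()
        if cfg == config and name != collection
    )
```

A REST handler could call this before applying a PUT and warn, or refuse, when the result is not empty.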

I imagine we can do the same with solrconfig too: split it up into small 
pieces of information kept in ZK. Then SolrCloud can have a compat mode that 
serializes this info as the old familiar files for those who need an export to 
plain single-node, or the opposite. Perhaps we can use ZK to keep N revisions 
too, so you could roll back a series of changes?
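
The "keep N revisions" idea could look roughly like this for a single node. This is an in-memory sketch only; the class name is made up, and how the revisions would actually be stored in ZK is left open.

```python
from collections import deque


class RevisionedNode:
    """Keeps the last max_revisions values of one config node (hypothetical)."""

    def __init__(self, initial, max_revisions=5):
        # deque with maxlen silently drops the oldest revision on overflow
        self._history = deque([initial], maxlen=max_revisions)

    @property
    def current(self):
        return self._history[-1]

    def update(self, value):
        self._history.append(value)

    def rollback(self, steps=1):
        """Discard the newest `steps` revisions and return the restored value."""
        if steps < 1 or steps >= len(self._history):
            raise ValueError("not enough revisions kept to roll back that far")
        for _ in range(steps):
            self._history.pop()
        return self.current
```

Rolling back a series of changes is then a matter of calling rollback with the number of steps, as long as that many revisions are still retained.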

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 6 March 2013 at 19:35, Mark Miller <markrmil...@gmail.com> wrote:

> bq. Change Solr schema serialization from XML to JSON, and provide an 
> XML->JSON conversion tool.
> 
> What is the motivation for the change? I think if you are sitting down and 
> looking to design a schema, working with the XML is fairly nice and fast. I 
> picture that a lot of people would start by working with the XML file to get 
> it how they want, and then perhaps do future changes with the REST API. When 
> you are developing, starting with the REST API feels fairly cumbersome if you 
> have to make a lot of changes/additions/removals.
> 
> So why not just keep the XML and add the REST API? Do we gain much by 
> switching it to JSON? I like JSON when it comes to REST, but when I think 
> about editing a large schema doc locally, XML seems much easier to deal with.
> 
> - Mark
