[ 
https://issues.apache.org/jira/browse/SOLR-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henri Biestro updated SOLR-646:
-------------------------------

    Description: 
This patch refers to 'generalized configuration properties' as specified by 
[HossMan|https://issues.apache.org/jira/browse/SOLR-350?focusedCommentId=12562834#action_12562834]
This means configuration & schema files can use expressions based on properties 
defined in solr.xml (formerly multicore.xml).

h3. Use cases:
Describe core data directories from solr.xml as properties.
Share the same schema and/or config file between multiple cores.
Share reusable fragments of schema & configuration between multiple cores.

h3. Usage:
h4. solr.xml
This *solr.xml* will be used to illustrate using properties for different 
purposes.
{code:xml}
<solr persistent="true">

  <property name="version">1.3</property>
  <property name="lang">english, french</property>
  <property name="en-cores">en,core0</property>
  <property name="fr-cores">fr,core1</property>

  <cores adminPath="/admin/cores">
    <core name="${en-cores}" instanceDir="./">
      <property name="version">3.5</property>
      <property name="l10n">EN</property>
      <property name="ctlField">core0</property>
      <property name="comment">This is a sample</property>
    </core>

    <core name="${fr-cores}" instanceDir="./">
      <property name="version">2.4</property>
      <property name="l10n">FR</property>
      <property name="ctlField">core1</property>
      <property name="comment">Ceci est un exemple</property>
    </core>
  </cores>
</solr>
{code}
{{version}}: if you update your solr.xml or your cores for various reasons, it 
can be useful to keep track of a version. In this example, it is used to 
define the {{dataDir}} for each core.
{{en-cores}}, {{fr-cores}}: with aliases, if the list is long or repetitive, it 
can be convenient to define the Solr core name through a property.
{{instanceDir}}: note that both cores use the same instance directory, 
sharing their configuration and schema. The {{dataDir}} is set for each of 
them from the *solrconfig.xml*.
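
As a rough illustration of the {{en-cores}} / {{fr-cores}} idea, the Java sketch below splits an already expanded core name such as "en,core0" into individual names. It assumes the names are comma-separated as in the example; the class {{CoreNamesSketch}} and its {{splitNames}} method are hypothetical and are not taken from the patch, which may treat the first name and the aliases differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the patch: splits an expanded core name
// attribute (for example "en,core0") into the individual names it lists.
final class CoreNamesSketch {
    static List<String> splitNames(String expanded) {
        List<String> names = new ArrayList<>();
        for (String part : expanded.split(",")) {
            String name = part.trim();
            // Skip empty entries produced by stray commas or whitespace.
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Value of the en-cores property from the solr.xml example.
        System.out.println(splitNames("en,core0"));
    }
}
```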

h4. solrconfig.xml
This is where our *solr.xml* properties are used to define the data directory as 
a composition of, in our example, the language code {{l10n}} and the core 
version stored in {{version}}.
{code:xml}
<config>
  <dataDir>${solr.solr.home}/data/${l10n}-${version}</dataDir>
....
</config>
{code}
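
To make the composition above concrete, here is a minimal Java sketch of how a {{dataDir}} expression could be expanded from a map of properties. The class {{PropertyExpander}}, its {{expand}} method and the {{/opt/solr}} home directory are assumptions made for illustration; the patch itself performs the expansion through DOMUtil.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the patch: expands ${name} expressions
// using a plain map as the evaluation context.
final class PropertyExpander {
    private static final Pattern EXPR = Pattern.compile("\\$\\{([^}:]+)\\}");

    static String expand(String text, Map<String, String> props) {
        Matcher m = EXPR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            // Leave unknown expressions untouched rather than guessing a value.
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("solr.solr.home", "/opt/solr"); // assumed installation path
        props.put("l10n", "EN");                  // from the en core
        props.put("version", "3.5");              // from the en core
        System.out.println(expand("${solr.solr.home}/data/${l10n}-${version}", props));
    }
}
```

With the properties of the French core instead, the same expression would resolve to a separate directory, which is what lets both cores share one instance directory.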

h4. schema.xml
The {{include}} element imports a file into the schema (or a solrconfig); 
this can help de-clutter long schemas.
The {{ctlField}} illustrates that a field & its type can be set 
through properties as well; in our example, we want the 'english' core to 
refer to an 'english-configured' field and the 'french' core to a 
'french-configured' one. The type for the field is defined as {{text-EN}} or 
{{text-FR}} after expansion.

{code:xml}
<schema name="example core ${l10n}" version="1.1">
  <types>
...
   <include resource="text-l10n.xml"/>
  </types>

 <fields>   
...
  <field name="${ctlField}"   type="text-${l10n}"   indexed="true"  
stored="true"  multiValued="true" /> 
 </fields>
{code}

This schema imports the *text-l10n.xml* file, which is a *fragment*; the 
fragment tag must be present & indicates the file is to be included. Our 
example only defines different stopwords for each language but you could of 
course extend this to stemmers, synonyms, etc.
{code:xml}
<fragment>
        <fieldType name="text-FR" class="solr.TextField" 
positionIncrementGap="100">
...
            <filter class="solr.StopFilterFactory" ignoreCase="true" 
words="stopwords-fr.txt"/>
...
        </fieldType>
        <fieldType name="text-EN" class="solr.TextField" 
positionIncrementGap="100">
...
            <filter class="solr.StopFilterFactory" ignoreCase="true" 
words="stopwords-en.txt"/>
...
        </fieldType>
</fragment>
{code}
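
The include/fragment mechanism described above can be pictured with the following Java sketch, which replaces each {{include}} element with the children of the referenced {{fragment}}. This is only an approximation under stated assumptions: the class {{IncludeSketch}} is hypothetical, resources are looked up in an in-memory map rather than through the resource loader, and nested includes are not handled.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the patch's IncludesEvaluator.
final class IncludeSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // resources maps a resource name to its XML text (an assumption made
    // here to keep the example self-contained).
    static void expandIncludes(Document doc, Map<String, String> resources) {
        NodeList found = doc.getElementsByTagName("include");
        // Copy first: the NodeList is live and changes as nodes are removed.
        List<Element> includes = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            includes.add((Element) found.item(i));
        }
        for (Element include : includes) {
            String name = include.getAttribute("resource");
            String xml = resources.get(name);
            if (xml == null) {
                throw new IllegalArgumentException("unknown resource: " + name);
            }
            Element fragment = parse(xml).getDocumentElement();
            if (!"fragment".equals(fragment.getTagName())) {
                throw new IllegalArgumentException(name + " is not a fragment");
            }
            Node parent = include.getParentNode();
            // Import a copy of each fragment child in place of the include.
            for (Node c = fragment.getFirstChild(); c != null; c = c.getNextSibling()) {
                parent.insertBefore(doc.importNode(c, true), include);
            }
            parent.removeChild(include);
        }
    }

    public static void main(String[] args) {
        Map<String, String> resources = new HashMap<>();
        resources.put("text-l10n.xml",
                "<fragment><fieldType name=\"text-FR\"/><fieldType name=\"text-EN\"/></fragment>");
        Document doc = parse("<schema><types><include resource=\"text-l10n.xml\"/></types></schema>");
        expandIncludes(doc, resources);
        System.out.println(doc.getElementsByTagName("fieldType").getLength() + " field types included");
    }
}
```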


h4. Technical specifications
solr.xml can define properties at the multicore & each core level.
Properties defined in the multicore scope can override system properties.
Properties defined in a core scope can override multicore & system properties.
Property definitions can use expressions to define their name & value; these 
expressions are evaluated in their outer scope context.
CoreContainer serialization keeps properties as defined; persistence is 
idempotent (i.e., property expressions are written, not their evaluation).
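
The override rules above amount to a chain of scopes in which the innermost definition wins. The Java sketch below shows one possible shape for such a chain; the class {{PropertyScope}} is hypothetical, the system-level value is invented, and the French core deliberately omits {{version}} here (unlike the solr.xml example) to show the fallback.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the patch's classes: a property scope that falls
// back to its parent when a name is not defined locally.
final class PropertyScope {
    private final Map<String, String> local = new HashMap<>();
    private final PropertyScope parent; // null for the outermost scope

    PropertyScope(PropertyScope parent) {
        this.parent = parent;
    }

    PropertyScope set(String name, String value) {
        local.put(name, value);
        return this;
    }

    // Innermost definition wins: core, then multicore, then system.
    String get(String name) {
        String value = local.get(name);
        if (value != null) {
            return value;
        }
        return (parent != null) ? parent.get(name) : null;
    }

    public static void main(String[] args) {
        PropertyScope system = new PropertyScope(null).set("version", "0.0"); // invented value
        PropertyScope multicore = new PropertyScope(system).set("version", "1.3");
        PropertyScope enCore = new PropertyScope(multicore).set("version", "3.5").set("l10n", "EN");
        PropertyScope frCore = new PropertyScope(multicore).set("l10n", "FR");
        System.out.println("en version: " + enCore.get("version")); // core scope overrides
        System.out.println("fr version: " + frCore.get("version")); // falls back to multicore
    }
}
```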

The core descriptor properties are automatically defined in each core context, 
namely:
solr.core.instanceDir
solr.core.name
solr.core.configName
solr.core.schemaName

h3. Coding notes:

DOMUtil.java:
refactored substituteSystemProperties to use an Evaluator;
an Evaluator is a DOM visitor that expands property expressions "in place" 
using a property map as an evaluation context
added an asString(node) method for logging purposes
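
For readers unfamiliar with the idea, a DOM visitor that expands expressions "in place" might look roughly like the Java sketch below. It is not the patch's DOMUtil.Evaluator: the class {{DomPropertyVisitor}} and its methods are hypothetical, and unknown properties are simply left unexpanded.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch: walks a DOM tree and rewrites attribute values and
// text nodes in place, using a map as the evaluation context.
final class DomPropertyVisitor {
    private static final Pattern EXPR = Pattern.compile("\\$\\{([^}]+)\\}");

    static String expand(String text, Map<String, String> props) {
        Matcher m = EXPR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    static void visit(Node node, Map<String, String> props) {
        NamedNodeMap attrs = node.getAttributes(); // null unless node is an element
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                attr.setNodeValue(expand(attr.getNodeValue(), props));
            }
        }
        if (node.getNodeType() == Node.TEXT_NODE) {
            node.setNodeValue(expand(node.getNodeValue(), props));
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            visit(c, props);
        }
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("ctlField", "core0");
        props.put("l10n", "EN");
        Document doc = parse("<field name=\"${ctlField}\" type=\"text-${l10n}\"/>");
        visit(doc, props);
        Element field = doc.getDocumentElement();
        System.out.println(field.getAttribute("name") + " " + field.getAttribute("type"));
    }
}
```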

CoreDescriptor.java:
added an expression member to keep property expressions as defined in solr.xml 
for persistence, so the file is written as defined (not as expanded)

CoreContainer.java:
added an expression member to keep property expressions as defined in solr.xml 
for persistence, so the file is written as defined (not as expanded);
solr.xml persistence is idempotent
added a local DOMUtil.Evaluator that tracks property expressions to evaluate & 
store them
issues outlined through solr-646:
fix in load:
CoreDescriptor p = new CoreDescriptor(this, names, ....);
was: CoreDescriptor p = new CoreDescriptor(this, name, ...);
fix in load:
register(aliases.get(a), core, false);
was: register(aliases.get(i), core, false);

CoreAdminHandler.java:
added an optional fileName to persist so it is possible to write the solr.xml 
to a different file (for comparison purposes)

CoreAdminRequest.java:
added PersistRequest to allow passing an optional fileName

Config.java:
substituteProperties has been moved out of the constructor & the doc member made 
protected to allow overriding
added an IncludesEvaluator that deals with include/fragment

SolrConfig.java & IndexSchema.java:
added explicit calls to substituteProperties to perform property/include 
expansion

SolrResourceLoader.java:
added a properties member to store CoreContainer & per-SolrCore properties
added a properties constructor parameter & a getter for properties
added initial property loading allowing installation-wide properties & 
solr.solr.home to be defined in a 'solr.properties' file
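
As a loose illustration of the last point, installation-wide values could be read with the standard {{java.util.Properties}} class, as in the Java sketch below. The class {{InstallPropertiesSketch}}, the sample keys and the {{/opt/solr}} path are assumptions; where the patch looks for 'solr.properties' and how it merges the values are not shown here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch, not the patch's SolrResourceLoader code: reads
// installation-wide properties in the usual key=value format.
final class InstallPropertiesSketch {
    static Properties load(Reader in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for the content of a 'solr.properties' file.
        String content = "solr.solr.home=/opt/solr\n"
                + "env.value=an installation value\n";
        Properties props = load(new StringReader(content));
        System.out.println(props.getProperty("solr.solr.home"));
    }
}
```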

SolrProperties.java:
test inspired by MulticoreExampleTestBase.java
loads 2 cores sharing a schema & config;
config defines dataDir using a property
schema uses a localization (l10n) property to define an attribute
persists the file to check it keeps the expression properties



  was:
This patch refers to 'generalized configuration properties' as specified by 
[HossMan|https://issues.apache.org/jira/browse/SOLR-350?focusedCommentId=12562834#action_12562834]
This means configuration & schema files can use expressions based on properties 
defined in multicore.xml.

h3. Use cases:
Share the same schema and/or config file between multiple cores but point to 
different dataDirs using a <dataDir>${core.datadir}</dataDir>
Change the general layout between data, config & schema directories.
Define 'installation' wide properties (for replication for instance) in 
multicore.properties (different base/install/data directories on different 
hosts).

h3. Syntax:
h4. Defining properties:
Either in the multicore.properties file (usual property format):
{code}
env.value=an installation value
env.other_value=another installation value
{code}
In the multicore.xml:
{code:xml}
<multicore>
  <property name='mp0'>a value</property>  <!-- a basic property -->
  <property name='mp1'>${env.value}</property>  <!-- a property whose value is 
an expression -->
  <core name='core_name'>
     <property name='p0'>some value</property>
     <property name='p1'>${mp0}-and some data</property>
  </core>
</multicore>
{code}
h4. Using properties:
Besides user-defined properties, the following core descriptor properties are 
automatically defined in each core context:
{code}
solr.core.instanceDir
solr.core.instancePath
solr.core.name
solr.core.configName
solr.core.schemaName
{code}
All properties can be used in any attribute or text node of a solrconfig.xml or 
schema.xml as in:
{code:xml}
<dataDir>${core.dataDir}</dataDir>
{code}

h4. Technical specifications
Multicore.xml can define properties at the multicore & each core level.
Properties defined in the multicore scope can override system properties.
Properties defined in a core scope can override multicore & system properties.
Property definitions can use expressions to define their name & value; these 
expressions are evaluated in their outer scope context.
Multicore serialization keeps properties as written (i.e., as expressions if they 
were defined so).

The core descriptor properties are automatically defined in each core context, 
namely:
solr.core.instanceDir
solr.core.instancePath
solr.core.name
solr.core.configName
solr.core.schemaName

h4. Test:
The following (contrived) multicore.xml:

{code:xml}
<multicore adminPath='/admin/multicore' persistent='true'>
  <property name='revision'>33</property>  <!-- a basic property -->
  <property name='zero'>0</property>  <!-- used to expand the core0 name  -->
  <property name='one'>1</property>  <!-- used to expand the core1 name  -->
  <property name='id_type'>bogus</property>  <!-- a bogus type that will be 
overridden  -->
  <property name='updateHandler'>bogus</property>  <!-- a bogus updateHandler 
that will be overridden  -->
  <core name='core${zero}' instanceDir='core0/'>    <!-- the name is expanded 
-->
    <property name='id_type'>core${zero}_id</property> <!-- so is a text node 
-->
    <property name='updateHandler'>solr.DirectUpdateHandler2</property> <!-- a 
property can be overridden -->
    <property name='revision'>11</property>
  </core>
  <core name='core${one}' instanceDir='core1/'>
    <property name='id_type'>core${one}_id</property>
    <property name='updateHandler'>solr.DirectUpdateHandler2</property>
    <property name='revision'>22</property>
  </core>
</multicore>
{code}

allows this config.xml:

{code:xml}
<config>
<!-- use the defined update handler property -->
  <updateHandler class="${updateHandler}" />

  <requestDispatcher handleSelect="true" >
    <requestParsers enableRemoteStreaming="false" 
multipartUploadLimitInKB="2048" />
  </requestDispatcher>
  
  <requestHandler name="standard" class="solr.StandardRequestHandler" 
default="true" />
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
  <requestHandler name="/admin/luke"       
class="org.apache.solr.handler.admin.LukeRequestHandler" />
  
  <!-- config for the admin interface --> 
  <admin>
    <defaultQuery>solr</defaultQuery>
    <gettableFiles>solrconfig.xml schema.xml admin-extra.html</gettableFiles>
    <pingQuery>
     qt=standard&amp;q=solrpingquery
    </pingQuery>
  </admin>

</config>
{code}

and this schema.xml:

{code:xml}
<schema name="example core zero" version="1.1">
  <types>
   <!-- define a type name dynamically -->
    <fieldtype name="${id_type:id_t}"  class="solr.StrField" 
sortMissingLast="true" omitNorms="true"/>
    <fieldtype name="string"  class="solr.StrField" sortMissingLast="true" 
omitNorms="true"/>
  </types>

 <fields>   
  <!-- the type of unique key defined above -->
  <field name="id"      type="${id_type:id_t}"   indexed="true"  stored="true"  
multiValued="false" required="true"/>
  <field name="type"    type="string"   indexed="true"  stored="true"  
multiValued="false" /> 
  <field name="name"    type="string"   indexed="true"  stored="true"  
multiValued="false" /> 
  <field name="${solr.core.name:core}"   type="string"   indexed="true"  
stored="true"  multiValued="false" /> 
 </fields>
 <uniqueKey>id</uniqueKey>
 <defaultSearchField>name</defaultSearchField>
 <solrQueryParser defaultOperator="OR"/>
</schema>
{code}

Together, these files allow the trunk test to work.
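
The schema above uses the {{${id_type:id_t}}} form, where the text after the colon is a default used when the property is not defined. A minimal Java sketch of that behaviour follows; the class {{DefaultingExpander}} is hypothetical, and throwing when neither a value nor a default exists is a choice made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: expands ${name} and ${name:default} expressions.
final class DefaultingExpander {
    // group 1: property name, group 2: optional default after the colon
    private static final Pattern EXPR =
            Pattern.compile("\\$\\{([^}:]+)(?::([^}]*))?\\}");

    static String expand(String text, Map<String, String> props) {
        Matcher m = EXPR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            if (value == null) {
                value = m.group(2); // null when no default was written
            }
            if (value == null) {
                throw new IllegalArgumentException("no value for property " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> core0 = new HashMap<>();
        core0.put("id_type", "core0_id");
        System.out.println(expand("${id_type:id_t}", core0));           // defined property
        System.out.println(expand("${id_type:id_t}", new HashMap<>())); // falls back to default
    }
}
```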

h3. Coding notes:
The code refactors some of DOMUtil (the Ant-based property 
substitution) into one added class (PropertyMap & PropertyMap.Evaluator).
The PropertyMaps are chained (a one-link chain from the core map to the 
multicore map); those maps are owned by each core's ResourceLoader.
Config is modified a little to accommodate delaying & specializing property 
expansions.
Multicore is modified so it properly parses & serializes.
Tested against the example above.

Reviews & comments more than welcome.


> Configuration properties in multicore.xml
> -----------------------------------------
>
>                 Key: SOLR-646
>                 URL: https://issues.apache.org/jira/browse/SOLR-646
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.3
>            Reporter: Henri Biestro
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 1.3
>
>         Attachments: solr-646.patch, SOLR-646.patch, solr-646.patch, 
> solr-646.patch, solr-646.patch, solr-646.patch, solr-646.patch
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.