That stuff isn't being used, but it is shipped with the default Solr example schema from early 4.0. The 4.0 default schema has changed a lot since then.
I'd rather ship a small Nutch-specific config without all the default Solr fieldTypes that aren't being used anyway.

-----Original message-----
> From: Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
> Sent: Tue 07-Aug-2012 13:43
> To: Markus Jelsma <markus.jel...@openindex.io>
> Cc: dev@nutch.apache.org
> Subject: Re: Understanding mapping of field characteristics to index structure
>
> Hi Markus,
> Thanks for getting back on this one last night. Please see comments inline.
>
> On Mon, Aug 6, 2012 at 11:12 PM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
> > Hi,
> > Tokenization depends whether an analyzer used for the field ... should be
> > boosted separately.
> >
>
> Thanks for clarifying, all is now crystal.
>
> > About the Solr4 schema, it wasn't introduced as a Solr4 compatible version
> > of the default schema.xml file and I think it should be removed in favour
> > of updating the schema.xml to Solr4. The only change I can think of is
> > adding the version field that is mandatory for SolrCloud. The schema
> > version is 1.5 which the default schema already has.
> >
>
> OK so what about all of the additional config in the schema-solr4.xml
> file which resides above the actual field definitions? E.g. the
> tokenisation, etc. parts you discuss above
> I also think it is too ambiguous (and slightly pointless) to maintain
> two schemas (unless of course someone can provide justification). I
> think (in all distributions moving forward) we should aim to simplify
> this and encapsulate all required field and configuration definitions
> in a single schema.xml...
>
> Lewis
>