Hello,

Could you tell me the difference between these two setups?

1) Having a DIH field in data-import-config.xml like this:

    <field column="body" name="article" stripHTML="true"/>

2) Having a field in schema.xml like this:

    <fieldType name="textNoHtml" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
      </analyzer>
    </fieldType>
    <field name="article" type="textNoHtml" indexed="true" stored="true" />

I assume that when I run the DIH, it first removes the HTML, and then, at index time, the char filter tries to remove the HTML again, even though it was already removed by stripHTML in data-import-config.xml. So, does it make sense to declare a field with stripHTML="true" when that field will be stored in a field whose type uses HTMLStripCharFilterFactory?

Thanks for your help.
- StripHTML and HTMLStripCharFilterFactory Sergio Martín Cantero
- Re: StripHTML and HTMLStripCharFilterFactory Jack Krupansky