Hello!

I've been trying to get the Nutch headings plugin working and have set up
the following:

nutch-site.xml

<property>
  <name>plugin.includes</name>
<value>protocol-okhttp|...|parse-(html|tika|text|metatags)|index-(basic|anchor|more|metadata)|...|headings|nutch-extensionpoints</value>
</property>

schema.xml


<!-- fields for the headings plugin -->
<field name="h1" type="text_general" stored="true" indexed="true"
multiValued="true"/>
<field name="h2" type="text_general" stored="true" indexed="true"
multiValued="true"/>
<field name="h3" type="text_general" stored="true" indexed="true"
multiValued="true"/>
<field name="h4" type="text_general" stored="true" indexed="true"
multiValued="true"/>
<field name="h5" type="text_general" stored="true" indexed="true"
multiValued="true"/>
<field name="h6" type="text_general" stored="true" indexed="true"
multiValued="true"/>

index-writers.xml

  <mapping>
      <rename>
        <field source="metatag.h1" dest="h1"/>
        <field source="metatag.h2" dest="h2"/>
        <field source="metatag.h3" dest="h3"/>
        <field source="metatag.h4" dest="h4"/>
        <field source="metatag.h5" dest="h5"/>
        <field source="metatag.h6" dest="h6"/>
      </rename>
...

After indexing to Solr, none of the HTML headings appear in my Solr index.
What's missing?

Thanks!
