Hello Mike,

I think it should be working just fine with the plugin enabled in plugin.includes. You can check Nutch's parser output by using:

$ bin/nutch parsechecker <URL>
You should see one or more h# output fields present. You can then use the index-metadata plugin to map the parser output fields to the indexer output by setting the values for index.parse.md.

Regards,
Markus

On Mon, 31 Oct 2022 at 04:51, Mike <mz579...@gmail.com> wrote:

> Hello!
>
> I've tried everything and set everything up and get the nutch headings
> plugin working:
>
> nutch-site.xml
>
> <property>protocol-okhttp
> <name>
>
>
> <value>protocol-okhttp|...|parse-(html|tika|text|metatags)|index-(basic|anchor|more|metadata)|...|headings|nutch-extensionpoints</value>
> </property>
>
> schema.xml
>
> <!-- fields for the headings plugin -->
> <field name="h1" type="text_general" stored="true" indexed="true" multiValued="true"/>
> <field name="h2" type="text_general" stored="true" indexed="true" multiValued="true"/>
> <field name="h3" type="text_general" stored="true" indexed="true" multiValued="true"/>
> <field name="h4" type="text_general" stored="true" indexed="true" multiValued="true"/>
> <field name="h5" type="text_general" stored="true" indexed="true" multiValued="true"/>
> <field name="h6" type="text_general" stored="true" indexed="true" multiValued="true"/>
>
> index-writers.xml
>
> <mapping>
> <rename>
> <field source="metatag.h1" dest="h1"/>
> <field source="metatag.h2" dest="h2"/>
> <field source="metatag.h3" dest="h3"/>
> <field source="metatag.h4" dest="h4"/>
> <field source="metatag.h5" dest="h5"/>
> <field source="metatag.h6" dest="h6"/>
> </rename>
> ...
>
> After indexing to solr there are no HTML headings tags in my solr index,
> what's missing?
>
> thanks!
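
For reference, the index.parse.md suggestion in the reply could look roughly like the following in nutch-site.xml. This is only a sketch: it assumes parsechecker reports the headings under the parse metadata keys h1 and h2, so the value should be adjusted to whichever keys actually appear in your own parsechecker output.

```xml
<!-- Sketch for nutch-site.xml. The key names h1 and h2 are an assumption;
     list the keys that bin/nutch parsechecker actually prints. -->
<property>
  <name>index.parse.md</name>
  <value>h1,h2</value>
</property>
```

If the fields then reach the indexer under the plain key names, the rename entries from metatag.h1 and so on in index-writers.xml may not match anything. That is an assumption to verify, for example by running bin/nutch indexchecker on the same URL.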