Thanks Markus, I'm not sure where the error was but I reinstalled Nutch and
it works with your setup.
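
A minimal sketch of the index.parse.md property with the metatag. prefix
removed from the heading keys, as suggested in the quoted reply below. It
assumes the headings plugin stores its values in the parse metadata under
the plain keys h1 to h6; only h1 and h2 appear in the working setup quoted
below, so h3 to h6 are an assumption and would presumably also need the
headings plugin to be configured to extract them. The other keys are kept
unchanged from the original configuration.

```xml
<property>
  <name>index.parse.md</name>
  <!-- Heading keys without the metatag. prefix (h3-h6 are an assumption,
       only h1,h2 are confirmed in this thread). The metatag.* keys for
       description, keywords and rating are left as in the original config. -->
  <value>metatag.description,metatag.keywords,metatag.rating,h1,h2,h3,h4,h5,h6</value>
</property>
```

With keys named like this, the fields would already arrive as h1 to h6, so a
rename from metatag.h1 to h1 in index-writers.xml should no longer be needed.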

On Mon, 31 Oct 2022 at 14:36, Markus Jelsma <markus.jel...@openindex.io> wrote:

> Hmmm, using a clean current Nutch I can get it to work with:
> <configuration>
>  <property>
>    <name>http.agent.name</name>
>    <value>NutchTest</value>
>  </property>
>  <property>
>    <name>index.parse.md</name>
>    <value>h1,h2</value>
>  </property>
>  <property>
>    <name>plugin.includes</name>
>    <value>headings|protocol-http|parse-tika|index-metadata</value>
>  </property>
> </configuration>
>
> $ bin/nutch indexchecker https://nutch.apache.org/
> digest :        13584e71e6e09a71071936feb97892b8
> h1 :    Apache Nutch™
> id :    https://nutch.apache.org/
>
> Can you check your configuration? Is a plugin name misspelled? Is the
> headings plugin active during fetch/parse? Is the index-metadata plugin
> active?
>
> Regards,
> Markus
>
>
> On Mon, 31 Oct 2022 at 14:14, Mike <mz579...@gmail.com> wrote:
>
> > Hello Markus!
> >
> > Thank you for taking care of my problem!
> >
> > I removed the metatag.h# from index.parse.md, but the Nutch indexchecker
> > still does not show me the fields.
> >
> > On Mon, 31 Oct 2022 at 12:56, Markus Jelsma <markus.jel...@openindex.io> wrote:
> >
> > > Hello Mike,
> > >
> > > Please remove the metatag.* prefix in the index.parse.md config and I
> > > think you should be fine.
> > >
> > > Regards,
> > > Markus
> > >
> > > On Mon, 31 Oct 2022 at 12:32, Mike <mz579...@gmail.com> wrote:
> > >
> > > > Yes, sorry, I also forgot to post this setting:
> > > >
> > > > <property>
> > > >    <name>index.parse.md</name>
> > > >    <value>metatag.description,metatag.keywords,metatag.rating,metatag.h1,metatag.h2,metatag.h3,metatag.h4,metatag.h5,metatag.h6</value>
> > > >    <description>
> > > >    Comma-separated list of keys to be taken from the parse metadata to
> > > >    generate fields.
> > > >    Can be used e.g. for 'description' or 'keywords' provided that these
> > > >    values are generated
> > > >    by a parser (see parse-metatags plugin)
> > > >    </description>
> > > > </property>
> > > >
> > > > The Nutch parsechecker shows me the fields but the indexchecker
> > > > doesn't.
> > > >
> > > > On Mon, 31 Oct 2022 at 04:51, Mike <mz579...@gmail.com> wrote:
> > > >
> > > > > Hello!
> > > > >
> > > > > I've tried everything and set everything up to get the Nutch
> > > > > headings plugin working:
> > > > >
> > > > > nutch-site.xml
> > > > >
> > > > > <property>
> > > > >   <name>plugin.includes</name>
> > > > >   <value>protocol-okhttp|...|parse-(html|tika|text|metatags)|index-(basic|anchor|more|metadata)|...|headings|nutch-extensionpoints</value>
> > > > > </property>
> > > > >
> > > > > schema.xml
> > > > >
> > > > >
> > > > > <!-- fields for the headings plugin -->
> > > > > <field name="h1" type="text_general" stored="true" indexed="true"
> > > > > multiValued="true"/>
> > > > > <field name="h2" type="text_general" stored="true" indexed="true"
> > > > > multiValued="true"/>
> > > > > <field name="h3" type="text_general" stored="true" indexed="true"
> > > > > multiValued="true"/>
> > > > > <field name="h4" type="text_general" stored="true" indexed="true"
> > > > > multiValued="true"/>
> > > > > <field name="h5" type="text_general" stored="true" indexed="true"
> > > > > multiValued="true"/>
> > > > > <field name="h6" type="text_general" stored="true" indexed="true"
> > > > > multiValued="true"/>
> > > > >
> > > > > index-writers.xml
> > > > >   <mapping>
> > > > >       <rename>
> > > > >         <field source="metatag.h1" dest="h1"/>
> > > > >         <field source="metatag.h2" dest="h2"/>
> > > > >         <field source="metatag.h3" dest="h3"/>
> > > > >         <field source="metatag.h4" dest="h4"/>
> > > > >         <field source="metatag.h5" dest="h5"/>
> > > > >         <field source="metatag.h6" dest="h6"/>
> > > > >       </rename>
> > > > > ...
> > > > >
> > > > > After indexing to Solr there are no HTML heading tags in my Solr
> > > > > index. What's missing?
> > > > >
> > > > > thanks!
> > > > >
> > > >
> > >
> >
>
