[ https://issues.apache.org/jira/browse/CONNECTORS-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wright resolved CONNECTORS-1550.
-------------------------------------
    Resolution: Not A Problem

Hi [~DonaldVdD], please post questions like this to the us...@manifoldcf.apache.org mailing list. Jira is meant for bugs and enhancement requests. Thank you!

> HTML Tag mapping
> ----------------
>
>                 Key: CONNECTORS-1550
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1550
>             Project: ManifoldCF
>          Issue Type: Wish
>          Components: Elastic Search connector, Tika extractor, Web connector
>    Affects Versions: ManifoldCF 2.10
>            Reporter: Donald Van den Driessche
>            Priority: Major
>
> I’ll be crawling a website with the standard Web connector. I want to extract
> just certain HTML tags like <h1>, <h2> and <p>.
> I’ve set up an HTML extractor transformation connector and the internal Tika
> transformation connector. But I can’t find any place to do a mapping to the
> output for this.
>
> Do I have to write my own transformation connector to extract the content of
> these tags? Or is there a built-in solution?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)