[ https://issues.apache.org/jira/browse/ANY23-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334089#comment-15334089 ]
Lewis John McGibbney commented on ANY23-291:
--------------------------------------------

Hi Thomas, this is a great start. If you are interested in learning how to contribute to Any23 via code, then we are more than happy to help get you on your way.
Nonetheless, thank you very much for the test file.
I'll have a look at augmenting the test case so that we extract embedded JSON-LD outside of <head>.
-- *Lewis*

> JSON-LD should be looked up in entire HTML document, not just in <head>
> -----------------------------------------------------------------------
>
>                 Key: ANY23-291
>                 URL: https://issues.apache.org/jira/browse/ANY23-291
>             Project: Apache Any23
>          Issue Type: Improvement
>          Components: extractors
>    Affects Versions: 1.2
>            Reporter: Thomas Francart
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: example-embedded-jsonld.html
>
>
> In
> org.apache.any23.extractor.html.EmbeddedJSONLDExtractor.extractJSONLDScript(),
> I think this line:
> List<Node> scriptNodes = DomUtils.findAll(in, "/HTML/HEAD/SCRIPT");
> is too restrictive. Scripts containing JSON-LD can be placed anywhere in the
> page, and some CMS/WordPress plugins that insert JSON-LD generate
> their output in the body, not in the head.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
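
To illustrate the restriction described in the issue, here is a minimal sketch of the difference between a head-only XPath and a document-wide one. It uses only the JDK's javax.xml classes, not Any23's DomUtils, so the class name JsonLdScriptFinder, the helper findScripts, and the lowercase element names are assumptions made for the example (the Any23 DOM uses upper-case names such as /HTML/HEAD/SCRIPT). It is not the actual fix for ANY23-291.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class JsonLdScriptFinder {

    // Hypothetical helper: parses a well-formed XHTML string and returns the
    // trimmed text content of every node matched by the given XPath expression.
    public static List<String> findScripts(String xhtml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> contents = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            contents.add(nodes.item(i).getTextContent().trim());
        }
        return contents;
    }

    public static void main(String[] args) throws Exception {
        // A page whose JSON-LD script sits in <body>, as some CMS plugins emit it.
        String page = "<html><head><title>t</title></head><body>"
                + "<script type=\"application/ld+json\">{\"@id\": \"http://example.org/a\"}</script>"
                + "</body></html>";

        // Head-only lookup misses the script in the body.
        System.out.println(findScripts(page, "/html/head/script").size());

        // Document-wide lookup, filtered on the JSON-LD media type, finds it.
        System.out.println(findScripts(page, "//script[@type='application/ld+json']").size());
    }
}
```

Filtering on the type attribute keeps the broader search from picking up ordinary JavaScript blocks elsewhere in the page.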