[ https://issues.apache.org/jira/browse/ANY23-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359621#comment-16359621 ]
ASF GitHub Bot commented on ANY23-291:
--------------------------------------

Github user ferrerod commented on the issue:

    https://github.com/apache/any23/pull/60

    Thanks Lewis.
    1. I do see where extractJSONLDScript receives 'out' as a parameter, and within the for loop, passes out to another call to extractor.run(....,out), (as somewhat recursive use, except the extractor in this case is a new instance), however I am still not seeing 'out' being written to. I suspect it may not be necessary if the code is working as expected. I will have a deeper look at test files, etc. I see in

> JSON-LD should be looked up in entire HTML document, not just in <head>
> -----------------------------------------------------------------------
>
>                 Key: ANY23-291
>                 URL: https://issues.apache.org/jira/browse/ANY23-291
>             Project: Apache Any23
>          Issue Type: Improvement
>          Components: extractors
>    Affects Versions: 1.2
>            Reporter: Thomas Francart
>            Assignee: Hans Brende
>            Priority: Minor
>             Fix For: 2.2
>
>         Attachments: example-embedded-jsonld.html
>
>
> In
> org.apache.any23.extractor.html.EmbeddedJSONLDExtractor.extractJSONLDScript(),
> I think this line :
> List<Node> scriptNodes = DomUtils.findAll(in, "/HTML/HEAD/SCRIPT");
> is too restrictive. scripts containing json-ld can be placed anywhere in the
> page, and actually some CMS/Wordpress plugin inserting JSON-LD are generating
> their output in the body, not in the head.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
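
To illustrate the restriction described in the issue, here is a minimal sketch using only the JDK's built-in XPath support rather than Any23's DomUtils. The class name JsonLdScriptLookup, the sample page, and the broader expression //SCRIPT[@type='application/ld+json'] are assumptions made for illustration; this is not necessarily the expression adopted in the actual fix. The sample uses upper-case element names only so that it matches the upper-case path quoted in the issue.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class JsonLdScriptLookup {

    // Hypothetical well-formed sample page: one JSON-LD script in the head,
    // one nested inside the body (as some CMS plugins emit it).
    static final String PAGE =
        "<HTML>"
      + "<HEAD><SCRIPT type=\"application/ld+json\">{\"@id\":\"head\"}</SCRIPT></HEAD>"
      + "<BODY><DIV><SCRIPT type=\"application/ld+json\">{\"@id\":\"body\"}</SCRIPT></DIV></BODY>"
      + "</HTML>";

    // Parses the sample page and returns how many nodes the XPath expression selects.
    static int count(String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(PAGE)));
        NodeList nodes = (NodeList) XPathFactory.newInstance()
            .newXPath()
            .evaluate(xpath, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Path quoted in the issue: only direct children of HEAD are selected.
        System.out.println(count("/HTML/HEAD/SCRIPT"));
        // Assumed broader alternative: JSON-LD scripts anywhere in the document.
        System.out.println(count("//SCRIPT[@type='application/ld+json']"));
    }
}
```

On this sample the head-only path selects one script, while the document-wide expression selects both, which is the behaviour the issue asks for.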