[ https://issues.apache.org/jira/browse/ANY23-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486412#comment-14486412 ]
Peter Ansell commented on ANY23-247:
------------------------------------

I think the only place they are defined right now is in DefaultValidator.loadDefaultRules, and the only place they are applied is in DefaultValidator.validate.

You may need to create an instance of Rule to match documents that have 'itemscope' and then use the Fix implementation that you have written already to patch them with 'itemscope="itemscope"'. You pair the Rule with the Fix in DefaultValidator.loadDefaultRules.

Ideally we would have a FixFactory interface that is implemented for each combination of a Rule with an optional Fix. The FixFactory can then be registered as a service using META-INF/services, to avoid having them hardcoded into DefaultValidator.loadDefaultRules.

> FIX Attribute name "itemscope" associated with an element type "html" must be
> followed by the ' = ' character.
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: ANY23-247
>                 URL: https://issues.apache.org/jira/browse/ANY23-247
>             Project: Apache Any23
>          Issue Type: Improvement
>    Affects Versions: 1.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.3
>
>
> In the following markup
> {code}
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/html4/loose.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml"
> xmlns:og="http://opengraphprotocol.org/schema/"
> xmlns:fb="http://www.facebook.com/2008/fbml" version="HTML+RDFa 1.0"
> xml:lang="en" itemscope itemtype="http://schema.org/Product">
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> <meta http-equiv="X-UA-Compatible" content="IE=edge" />
> <meta name="generator" content="ToolTwist" />
> ...
> {code}
> Due to the absence of any subsequent value for *itemscope*, we get the
> following error in our web server logs
> {code}
> [Fatal Error] :2:185: Attribute name "itemscope" associated with an element
> type "html" must be followed by the ' = ' character.
> {code}
> Although the markup semantics are incorrect, Any23 should simply perform a
> check for the itemscope value being null, if this is the case then add *=""*,
> there is a precedent for us doing something like this before, I just can't
> find the ticket right now!
> The code we need to add is present within either
> core/src/main/java/org/apache/any23/extractor/microdata/ItemScope.java
> core/src/main/java/org/apache/any23/extractor/microdata/MicrodataParser.java

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
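
For illustration only, the Rule/Fix pairing and the FixFactory idea from the comment could look roughly like the sketch below. Everything in it is an assumption made for the example: SketchRule, SketchFix, SketchFixFactory and ItemscopeFixFactory are hypothetical names, not the Any23 validator interfaces, and the sketch patches a raw markup string with a regex, whereas a real Rule and Fix would presumably work on the parsed document.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

final class ItemscopeFixSketch {

    // Hypothetical stand-ins for illustration; NOT the Any23 Rule/Fix interfaces.
    interface SketchRule { boolean matches(String markup); }
    interface SketchFix { String apply(String markup); }

    // Hypothetical FixFactory: one rule paired with an optional fix.
    interface SketchFixFactory {
        SketchRule rule();
        Optional<SketchFix> fix();
    }

    // Matches a bare "itemscope" attribute name, i.e. one not followed by '='.
    // Assumption: a regex over raw text is enough for a sketch. It would also
    // match the word in body text, so real code should inspect the DOM instead.
    private static final Pattern BARE_ITEMSCOPE =
            Pattern.compile("(\\sitemscope)(?![\\w-])(?!\\s*=)");

    static final class ItemscopeFixFactory implements SketchFixFactory {
        @Override
        public SketchRule rule() {
            return markup -> BARE_ITEMSCOPE.matcher(markup).find();
        }

        @Override
        public Optional<SketchFix> fix() {
            // Rewrite the bare attribute as itemscope="itemscope".
            return Optional.of(markup ->
                    BARE_ITEMSCOPE.matcher(markup).replaceAll("$1=\"itemscope\""));
        }
    }

    // Hardcoded here to keep the sketch self-contained. With a provider file
    // under META-INF/services, this list could instead be populated from
    // java.util.ServiceLoader.load(SketchFixFactory.class).
    private static final List<SketchFixFactory> FACTORIES =
            List.of(new ItemscopeFixFactory());

    // Applies every fix whose rule matches the current markup.
    static String patch(String markup) {
        String result = markup;
        for (SketchFixFactory factory : FACTORIES) {
            Optional<SketchFix> fix = factory.fix();
            if (fix.isPresent() && factory.rule().matches(result)) {
                result = fix.get().apply(result);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<html xml:lang=\"en\" itemscope "
                + "itemtype=\"http://schema.org/Product\">";
        System.out.println(patch(html));
    }
}
```

Keeping the rule and the fix behind one factory means adding a new repair is a matter of shipping another implementation, without touching the code that iterates over them.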