[jira] [Created] (ANY23-281) Build Policeman's Forbidden API Checker into Maven config

2016-04-02 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created ANY23-281:
--

         Summary: Build Policeman's Forbidden API Checker into Maven config
             Key: ANY23-281
             URL: https://issues.apache.org/jira/browse/ANY23-281
         Project: Apache Any23
      Issue Type: Improvement
      Components: build
        Reporter: Lewis John McGibbney
         Fix For: 1.3


The [forbidden API checker|https://github.com/policeman-tools/forbidden-apis] 
scans compiled classes for calls to APIs that a project has chosen to forbid. 
Adding it to the Maven configuration would give the Any23 build stricter 
API-usage checks.
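
To illustrate the kind of call such a checker is typically configured to 
reject, here is a minimal Java sketch of the default-locale pitfall. The class 
and method names (DefaultLocalePitfall, lowerWith) are hypothetical and not part 
of Any23, and which signatures actually get forbidden depends on how the 
checker is configured for the build.

```java
import java.util.Locale;

public class DefaultLocalePitfall {

    // Lower-cases with an explicit locale. The no-argument String.toLowerCase()
    // silently uses the JVM default locale instead, which is the style of call
    // a forbidden-API check is usually set up to flag.
    static String lowerWith(Locale locale, String s) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // Under Turkish casing rules, 'I' lower-cases to dotless 'ı' (U+0131).
        System.out.println(lowerWith(turkish, "TITLE"));
        // Locale.ROOT gives locale-independent behaviour.
        System.out.println(lowerWith(Locale.ROOT, "TITLE"));
    }
}
```

The same input yields different strings depending on the locale, so code that 
compares tag or attribute names should name its locale explicitly.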



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ANY23-247) FIX Attribute name "itemscope" associated with an element type "html" must be followed by the ' = ' character.

2016-04-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ANY23-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223033#comment-15223033
 ] 

Hudson commented on ANY23-247:
--

UNSTABLE: Integrated in Any23-trunk #1468 (See 
[https://builds.apache.org/job/Any23-trunk/1468/])
ANY23-247 FIX Attribute name itemscope associated with an element type 
(lewis.j.mcgibbney: rev fc4593272a2e331ac5abfbe8ef1c46713a2b6f7f)
* core/src/test/java/org/apache/any23/validator/DefaultValidatorTest.java
* core/src/test/java/org/apache/any23/Any23Test.java
* core/src/main/java/org/apache/any23/extractor/SingleDocumentExtraction.java
* test-resources/src/test/resources/org/apache/any23/validator/microdata-basic.html
* core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java
* core/src/main/java/org/apache/any23/validator/rule/MissingOpenGraphNamespaceRule.java
* core/src/main/java/org/apache/any23/validator/rule/OpenGraphNamespaceFix.java
* src/site/apt/index.apt
* core/src/main/java/org/apache/any23/validator/DefaultValidator.java
* core/src/main/java/org/apache/any23/validator/rule/MetaNameMisuseFix.java
* core/src/main/java/org/apache/any23/validator/rule/MissingItemscopeAttributeValueRule.java
* core/src/test/resources/log4j.properties
* core/src/main/java/org/apache/any23/validator/rule/MissingItemscopeAttributeValueFix.java
* core/src/main/java/org/apache/any23/validator/rule/MetaNameMisuseRule.java


> FIX Attribute name "itemscope" associated with an element type "html" must be 
> followed by the ' = ' character.
> --
>
>                 Key: ANY23-247
>                 URL: https://issues.apache.org/jira/browse/ANY23-247
>             Project: Apache Any23
>          Issue Type: Improvement
>    Affects Versions: 1.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.2
>
>
> In the following markup
> {code}
>  "http://www.w3.org/TR/html4/loose.dtd">
> http://www.w3.org/1999/xhtml" 
> xmlns:og="http://opengraphprotocol.org/schema/" 
> xmlns:fb="http://www.facebook.com/2008/fbml" version="HTML+RDFa 1.0" 
> xml:lang="en" itemscope itemtype="http://schema.org/Product">
> 
> 
> 
> 
> ...
> {code}
> Due to the absence of any subsequent value for *itemscope*, we get the 
> following error in our web server logs
> {code}
> [Fatal Error] :2:185: Attribute name "itemscope" associated with an element 
> type "html" must be followed by the ' = ' character.
> {code}
> Although the markup semantics are incorrect, Any23 should simply check 
> whether the itemscope value is null and, if so, add *=""*. There is a 
> precedent for us doing something like this before; I just can't find the 
> ticket right now!
> The code we need to add belongs in either 
> core/src/main/java/org/apache/any23/extractor/microdata/ItemScope.java or 
> core/src/main/java/org/apache/any23/extractor/microdata/MicrodataParser.java
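
The workaround described in the issue can be sketched in a few lines of Java 
using only the JDK's XML parser. This is an illustration under stated 
assumptions, not the change committed for ANY23-247: the class and method names 
(ItemscopeFixSketch, addEmptyItemscopeValue, parsesAsXml) are hypothetical, and 
the regex rewrite is a deliberately naive stand-in that would also touch a bare 
"itemscope" word appearing in text content.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ItemscopeFixSketch {

    // A bare "itemscope": preceded by whitespace and not followed by "=".
    private static final Pattern BARE_ITEMSCOPE =
            Pattern.compile("(?<=\\s)itemscope(?![\\w-]|\\s*=)");

    // Hypothetical helper: gives a valueless itemscope an empty value.
    static String addEmptyItemscopeValue(String markup) {
        return BARE_ITEMSCOPE.matcher(markup).replaceAll("itemscope=\"\"");
    }

    // Returns true if a strict XML parser accepts the markup.
    static boolean parsesAsXml(String markup) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Strict parsing rejects attributes that have no "=value" part.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String markup =
                "<html itemscope itemtype=\"http://schema.org/Product\"><body/></html>";
        System.out.println(parsesAsXml(markup));
        System.out.println(parsesAsXml(addEmptyItemscopeValue(markup)));
    }
}
```

The first call fails in the strict parser and the second succeeds, because the 
only difference between the two inputs is the added empty attribute value.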





Jenkins build is still unstable: Any23-trunk #1468

2016-04-02 Thread Apache Jenkins Server
See 



[jira] [Updated] (ANY23-280) Refactor ContentExtractor to improve extraction flexibility

2016-04-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ANY23-280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated ANY23-280:
---
Summary: Refactor ContentExtractor to improve extraction flexibility  (was: 
Restructure ContentExtractor to improve extraction flexibility)

> Refactor ContentExtractor to improve extraction flexibility
> ---
>
>                 Key: ANY23-280
>                 URL: https://issues.apache.org/jira/browse/ANY23-280
>             Project: Apache Any23
>          Issue Type: Improvement
>          Components: core, extractors
>    Affects Versions: 1.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Critical
>             Fix For: 1.2
>
>
> As discussed on ANY23-247, the 
> [ContentExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L44]
>  is simply not fit for purpose. This issue was discovered and the cause has 
> plagued our builds ever since. Any extractors which implement 
> [BaseRDFExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java]
>  are based on the Extractor.ContentExtractor and hence work off of an 
> 'unfixed' raw data stream as opposed to a more flexible model such as the 
> [TagSoupDOMExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L60].
> This issue should restructure RDF extractors to enable more flexibility and 
> to avoid issues we encounter with the strict SAX parsing logic.





[jira] [Updated] (ANY23-280) Refactor ContentExtractor to improve extraction flexibility

2016-04-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ANY23-280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated ANY23-280:
---
Description: 
As discussed on ANY23-247, the 
[ContentExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L44]
 is simply not fit for purpose. This issue was discovered and the cause has 
plagued our builds ever since. Any extractors which implement 
[BaseRDFExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java]
 are based on the Extractor.ContentExtractor and hence work off of an 'unfixed' 
raw data stream as opposed to a more flexible model such as the 
[TagSoupDOMExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L60].
This issue should refactor RDF extractors to enable more flexibility and to 
avoid issues we encounter with the strict SAX parsing logic.

  was:
As discussed on ANY23-247, the 
[ContentExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L44]
 is simply not fit for purpose. This issue was discovered and the cause has 
plagued our builds ever since. Any extractors which implement 
[BaseRDFExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java]
 are based on the Extractor.ContentExtractor and hence work off of an 'unfixed' 
raw data stream as opposed to a more flexible model such as the 
[TagSoupDOMExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L60].
This issue should restructure RDF extractors to enable more flexibility and to 
avoid issues we encounter with the strict SAX parsing logic.







[jira] [Created] (ANY23-280) Restructure ContentExtractor to improve extraction flexibility

2016-04-02 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created ANY23-280:
--

         Summary: Restructure ContentExtractor to improve extraction flexibility
             Key: ANY23-280
             URL: https://issues.apache.org/jira/browse/ANY23-280
         Project: Apache Any23
      Issue Type: Improvement
      Components: core, extractors
Affects Versions: 1.1
        Reporter: Lewis John McGibbney
        Assignee: Lewis John McGibbney
        Priority: Critical
         Fix For: 1.2


As discussed on ANY23-247, the 
[ContentExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L44]
 is simply not fit for purpose. This issue was discovered and the cause has 
plagued our builds ever since. Any extractors which implement 
[BaseRDFExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java]
 are based on the Extractor.ContentExtractor and hence work off of an 'unfixed' 
raw data stream as opposed to a more flexible model such as the 
[TagSoupDOMExtractorhttps://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L60].
This issue should restructure RDF extractors to enable more flexibility and to 
avoid issues we encounter with the strict SAX parsing logic.





[jira] [Updated] (ANY23-280) Restructure ContentExtractor to improve extraction flexibility

2016-04-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ANY23-280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated ANY23-280:
---
Description: 
As discussed on ANY23-247, the 
[ContentExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L44]
 is simply not fit for purpose. This issue was discovered and the cause has 
plagued our builds ever since. Any extractors which implement 
[BaseRDFExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java]
 are based on the Extractor.ContentExtractor and hence work off of an 'unfixed' 
raw data stream as opposed to a more flexible model such as the 
[TagSoupDOMExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L60].
This issue should restructure RDF extractors to enable more flexibility and to 
avoid issues we encounter with the strict SAX parsing logic.

  was:
As discussed on ANY23-247, the 
[ContentExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L44]
 is simply not fit for purpose. This issue was discovered and the cause has 
plagued our builds ever since. Any extractors which implement 
[BaseRDFExtractor|https://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java]
 are based on the Extractor.ContentExtractor and hence work off of an 'unfixed' 
raw data stream as opposed to a more flexible model such as the 
[TagSoupDOMExtractorhttps://github.com/apache/any23/blob/63ba2fc82966cc056a2e475af849154d0dfdcf93/api/src/main/java/org/apache/any23/extractor/Extractor.java#L60].
This issue should restructure RDF extractors to enable more flexibility and to 
avoid issues we encounter with the strict SAX parsing logic.







[jira] [Resolved] (ANY23-247) FIX Attribute name "itemscope" associated with an element type "html" must be followed by the ' = ' character.

2016-04-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ANY23-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved ANY23-247.

Resolution: Fixed






[jira] [Commented] (ANY23-247) FIX Attribute name "itemscope" associated with an element type "html" must be followed by the ' = ' character.

2016-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ANY23-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223021#comment-15223021
 ] 

ASF GitHub Bot commented on ANY23-247:
--

Github user asfgit closed the pull request at:

https://github.com/apache/any23/pull/17







[GitHub] any23 pull request: ANY23-247 FIX Attribute name itemscope associa...

2016-04-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/any23/pull/17

