Andrew Wang created HADOOP-14501:
------------------------------------

             Summary: aalto-xml cannot handle some odd XML features
                 Key: HADOOP-14501
                 URL: https://issues.apache.org/jira/browse/HADOOP-14501
             Project: Hadoop Common
          Issue Type: Bug
          Components: conf
    Affects Versions: 2.9.0, 3.0.0-alpha4
            Reporter: Andrew Wang
            Priority: Blocker


[~hgadre] tried testing Solr with a Hadoop 3 client. He saw several test case 
failures that appear to stem from functionality gaps in the new aalto-xml StAX 
implementation pulled in by HADOOP-14216:

{noformat}
   [junit4]    > Throwable #1: com.fasterxml.aalto.WFCException: Illegal XML character ('ü' (code 252))
....
   [junit4]    > Caused by: com.fasterxml.aalto.WFCException: General entity reference (&bar;) encountered in entity expanding mode: operation not (yet) implemented
...
   [junit4]    > Throwable #1: org.apache.solr.common.SolrException: General entity reference (&wacky;) encountered in entity expanding mode: operation not (yet) implemented
{noformat}

These were from the following test case executions:

{noformat}
NOTE: reproduce with: ant test  -Dtestcase=DocumentAnalysisRequestHandlerTest -Dtests.method=testCharsetOutsideDocument -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=und -Dtests.timezone=Atlantic/Faeroe -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
NOTE: reproduce with: ant test  -Dtestcase=MBeansHandlerTest -Dtests.method=testXMLDiffWithExternalEntity -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=en-US -Dtests.timezone=US/Aleutian -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
NOTE: reproduce with: ant test  -Dtestcase=XmlUpdateRequestHandlerTest -Dtests.method=testExternalEntities -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=hr -Dtests.timezone=America/Barbados -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
NOTE: reproduce with: ant test  -Dtestcase=XmlUpdateRequestHandlerTest -Dtests.method=testNamedEntity -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=hr -Dtests.timezone=America/Barbados -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
{noformat}
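
For anyone who wants to poke at the general-entity failures outside of the Solr 
test suite, here is a minimal sketch that uses only the standard 
javax.xml.stream API. The class name, method name, and sample document are made 
up for illustration and are not taken from the Solr tests. Which parser 
XMLInputFactory.newInstance() returns depends on the classpath: on a plain JDK 
it should be the built-in StAX parser, which is expected to expand &bar; into 
its replacement text, while with aalto-xml registered as the StAX provider the 
same input would presumably hit the "not (yet) implemented" error quoted above.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of Hadoop, Solr, or aalto-xml.
class EntityExpansionSketch {

    // Parses the document and returns the concatenated character data,
    // asking the parser to expand general entity references inline.
    static String parseText(String xml) throws Exception {
        // Picks up whichever StAX implementation is registered on the classpath.
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Internal general entity declared in the DTD subset, referenced in content.
        String xml = "<!DOCTYPE foo [<!ENTITY bar \"baz\">]><foo>&bar;</foo>";
        System.out.println(parseText(xml));
    }
}
```

Running the same class once with and once without the aalto-xml jar on the 
classpath should make it easy to see whether the behaviour difference comes 
from the parser swap alone.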



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
