Andrew Wang created HADOOP-14501:
------------------------------------

             Summary: aalto-xml cannot handle some odd XML features
                 Key: HADOOP-14501
                 URL: https://issues.apache.org/jira/browse/HADOOP-14501
             Project: Hadoop Common
          Issue Type: Bug
          Components: conf
    Affects Versions: 2.9.0, 3.0.0-alpha4
            Reporter: Andrew Wang
            Priority: Blocker

[~hgadre] tried testing Solr with a Hadoop 3 client. He saw various test case failures due to what look like functionality gaps in the new aalto-xml StAX implementation pulled in by HADOOP-14216:

{noformat}
   [junit4]    > Throwable #1: com.fasterxml.aalto.WFCException: Illegal XML character ('ü' (code 252))
....
   [junit4]    > Caused by: com.fasterxml.aalto.WFCException: General entity reference (&bar;) encountered in entity expanding mode: operation not (yet) implemented
...
   [junit4]    > Throwable #1: org.apache.solr.common.SolrException: General entity reference (&wacky;) encountered in entity expanding mode: operation not (yet) implemented
{noformat}

These were from the following test case executions:

{noformat}
NOTE: reproduce with: ant test  -Dtestcase=DocumentAnalysisRequestHandlerTest -Dtests.method=testCharsetOutsideDocument -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=und -Dtests.timezone=Atlantic/Faeroe -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
NOTE: reproduce with: ant test  -Dtestcase=MBeansHandlerTest -Dtests.method=testXMLDiffWithExternalEntity -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=en-US -Dtests.timezone=US/Aleutian -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
NOTE: reproduce with: ant test  -Dtestcase=XmlUpdateRequestHandlerTest -Dtests.method=testExternalEntities -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=hr -Dtests.timezone=America/Barbados -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
NOTE: reproduce with: ant test  -Dtestcase=XmlUpdateRequestHandlerTest -Dtests.method=testNamedEntity -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=hr -Dtests.timezone=America/Barbados -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
{noformat}
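
For reference, a minimal sketch of the kind of input behind the "General entity reference" failures. The actual Solr test documents are not shown here, so the sample XML below is an assumption: only the entity name `bar` comes from the error message, while the class name `EntityExpansionDemo`, the helper `textOf`, and the replacement text `baz` are made up for illustration. It goes through the plain `javax.xml.stream` API, so whichever StAX implementation is found on the classpath does the parsing. With the JDK's bundled parser the entity should be expanded and the text come back as `baz`; with aalto-xml selected as the factory, the stack traces above suggest the same input would fail instead.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class EntityExpansionDemo {

    // Hypothetical sample: an internal general entity declared in the DTD
    // subset and referenced from element content.
    static final String SAMPLE =
        "<!DOCTYPE doc [<!ENTITY bar \"baz\">]><doc>&bar;</doc>";

    // Concatenates all character data in the document, asking the parser
    // to expand entity references while reading.
    static String textOf(String xml) throws XMLStreamException {
        // Uses whichever StAX implementation the JAXP lookup selects.
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(textOf(SAMPLE));
    }
}
```

Running the same class with and without the aalto-xml jar registered as the StAX provider would be one way to confirm whether the gap is in the parser rather than in Solr or Hadoop.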