[jira] [Commented] (UIMA-6064) External DTD usage in XML descriptors disabled during build revision upgrade

2019-06-25 Thread Marshall Schor (JIRA)


[ 
https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872652#comment-16872652
 ] 

Marshall Schor commented on UIMA-6064:
--

OK, I'll do this a piece at a time.  For this Jira, we'll add (only) a way to 
enable the DOCTYPE declaration.

> External DTD usage in XML descriptors disabled during build revision upgrade
> 
>
> Key: UIMA-6064
> URL: https://issues.apache.org/jira/browse/UIMA-6064
> Project: UIMA
> Issue Type: Bug
> Components: Core Java Framework
> Affects Versions: 2.10.2SDK
> Reporter: Timo Boehme
> Priority: Major
>
> Between versions 2.10.1 and 2.10.2 the XMLParser configuration was changed 
> (hard-coded, with no way to adjust it) so that DTDs, and loading them from an 
> external file, are no longer allowed.
> This is done in XMLUtils.createSAXParserFactory(), which sets the 
> DISALLOW_DOCTYPE_DECL and LOAD_EXTERNAL_DTD features. Previously the 
> SAXParserFactory was created without adjusting these features.
> While I understand that this was done to prevent malicious XML from doing 
> nasty things, the way it was done is problematic:
>  * the change happened in a revision build, with no major or minor number change
>  * it was not documented
>  * one cannot simply change it back, e.g. via an environment variable or a 
> method call - the only workaround is a problematic sub-classing of 
> XMLParser_impl with additional configuration etc.
> We use DTDs for CPE descriptors quite a lot to split the descriptor into 
> modular chunks using entities etc. Thus it is important (for the time being) 
> to use DTDs there - and we know that the XML is not problematic.
> Because this feature (DTD) is crucial, I have marked this as a bug, since such 
> changes should not occur in a build upgrade, or it should at least be possible 
> to get the old behavior back easily.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (UIMA-6064) External DTD usage in XML descriptors disabled during build revision upgrade

2019-06-19 Thread Timo Boehme (JIRA)


[ 
https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867304#comment-16867304
 ] 

Timo Boehme commented on UIMA-6064:
---

DISALLOW_DOCTYPE_DECL is the most important one, as otherwise using a DTD with 
includes etc. in descriptors is not possible. Validating against a schema does 
not seem to be needed, as the XML parser itself does some checks. 

I am not sure about the TransformerFactory and 
ACCESS_EXTERNAL_DTD/ACCESS_EXTERNAL_STYLESHEET. At least the possibility to 
allow file access could be of interest.



[jira] [Commented] (UIMA-6064) External DTD usage in XML descriptors disabled during build revision upgrade

2019-06-17 Thread Marshall Schor (JIRA)


[ 
https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865896#comment-16865896
 ] 

Marshall Schor commented on UIMA-6064:
--

Not sure what the range of reasonable use cases is here. 

Is just a switch/flag for DISALLOW_DOCTYPE_DECL all that's needed in your use 
case?

If so, I think I would rather implement that (only) until we get more evidence 
of what other use cases are needed.



[jira] [Commented] (UIMA-6064) External DTD usage in XML descriptors disabled during build revision upgrade

2019-06-17 Thread Marshall Schor (JIRA)


[ 
https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865721#comment-16865721
 ] 

Marshall Schor commented on UIMA-6064:
--

Thanks; good catch on ACCESS_EXTERNAL_SCHEMA - we'll add that.

It seems reasonable to add one switch to restrict all / none (default: restrict 
all).

I'll work on coming up with a spec for the true/false values and for the 
"protocols" values (or all, or "").

I'm thinking of making all of these -D kind of parameters, because that allows 
env vars (as the value of the -D parameter) as an alternative. 
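
As a rough illustration of the -D idea, here is a minimal Java sketch, not the 
actual UIMA change. The property name example.xml.allow_doctype and the class 
name are hypothetical; the feature URI is the Xerces disallow-doctype-decl 
feature. The default (property unset) keeps the restriction in place.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

final class DoctypeSwitch {
  // Hypothetical property name, used only for this illustration.
  static final String ALLOW_DOCTYPE_PROPERTY = "example.xml.allow_doctype";

  static final String DISALLOW_DOCTYPE_DECL =
      "http://apache.org/xml/features/disallow-doctype-decl";

  // DOCTYPE stays disallowed unless the system property is set to "true".
  static boolean disallowDoctype() {
    return !Boolean.getBoolean(ALLOW_DOCTYPE_PROPERTY);
  }

  // Creates a factory whose DOCTYPE handling follows the system property.
  static SAXParserFactory createFactory() {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    try {
      factory.setFeature(DISALLOW_DOCTYPE_DECL, disallowDoctype());
    } catch (ParserConfigurationException | SAXException e) {
      throw new IllegalStateException("Parser does not support " + DISALLOW_DOCTYPE_DECL, e);
    }
    return factory;
  }

  public static void main(String[] args) {
    createFactory();
    System.out.println("DOCTYPE disallowed: " + disallowDoctype());
  }
}
```

It would be started with something like java -Dexample.xml.allow_doctype=true ..., 
and a shell can substitute an environment variable as the value, e.g. 
-Dexample.xml.allow_doctype=$ALLOW_DOCTYPE (variable name also hypothetical).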



[jira] [Commented] (UIMA-6064) External DTD usage in XML descriptors disabled during build revision upgrade

2019-06-17 Thread Timo Boehme (JIRA)


[ 
https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865449#comment-16865449
 ] 

Timo Boehme commented on UIMA-6064:
---

I think the problem starts with the possibility to categorize the features by 
at least 3 orthogonal dimensions:
 # defining API: SAX, Java (JAXP), Apache
 # target: DTD, Schema, Stylesheet
 # value type: boolean vs. allowed schemes (http, file, ...)

(PS: why was javax.xml.XMLConstants.ACCESS_EXTERNAL_SCHEMA not specified? See 
https://docs.oracle.com/javase/tutorial/jaxp/properties/properties.html)

The current Apache Xerces does not support the new JAXP features (only the 
parser bundled with Java does), so one gets a warning each time.

Thus it might be hard to find good abstract settings to capture the different 
uses - e.g. if JAXP is already set up correctly externally, there would be no 
need to do it here again. Maybe this could be a possibility:
 * have one switch for restricting all/restrict none
 * have an environment variable for XML features to set (comma separated), e.g. 
feature1:true,feature2:false,...
 * have an environment variable for XML properties to set (comma separated), 
e.g. property1:all,property2:file,...

For all features/properties defined via variable use this value instead of the 
hard-coded one.
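
A minimal Java sketch of how such a comma-separated feature list might be 
parsed; the class and method names are hypothetical and this is not existing 
UIMA code. One assumption worth flagging: real feature names are URIs that 
contain ':' themselves, so the sketch splits each entry at its last colon.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FeatureSpecParser {
  // Parses "name1:true,name2:false" into an ordered name -> value map.
  static Map<String, Boolean> parseFeatures(String spec) {
    Map<String, Boolean> result = new LinkedHashMap<>();
    if (spec == null || spec.trim().isEmpty()) {
      return result; // nothing configured: caller keeps its hard-coded defaults
    }
    for (String entry : spec.split(",")) {
      String trimmed = entry.trim();
      // Feature names are URIs containing ':', so split at the LAST colon.
      int sep = trimmed.lastIndexOf(':');
      if (sep <= 0 || sep == trimmed.length() - 1) {
        throw new IllegalArgumentException("Expected name:value but got '" + trimmed + "'");
      }
      String name = trimmed.substring(0, sep).trim();
      String value = trimmed.substring(sep + 1).trim();
      result.put(name, Boolean.parseBoolean(value));
    }
    return result;
  }

  public static void main(String[] args) {
    String spec = "http://apache.org/xml/features/disallow-doctype-decl:false,"
        + "http://apache.org/xml/features/nonvalidating/load-external-dtd:true";
    parseFeatures(spec).forEach((name, value) -> System.out.println(name + " = " + value));
  }
}
```

Each parsed entry would then override the corresponding hard-coded value before 
the factory is configured; the properties list (property1:all,...) could be 
handled the same way with String values instead of Boolean.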

 



[jira] [Commented] (UIMA-6064) External DTD usage in XML descriptors disabled during build revision upgrade

2019-06-13 Thread Marshall Schor (JIRA)


[ 
https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863490#comment-16863490
 ] 

Marshall Schor commented on UIMA-6064:
--

There are 6 xml / sax feature strings in XMLUtils.  Should one switch/flag be 
used to disable/enable all of these?  Or just some?  Or do we need multiple 
flags?  Thanks for clarifying.



[jira] [Commented] (UIMA-6064) External DTD usage in XML descriptors disabled during build revision upgrade

2019-06-13 Thread Timo Boehme (JIRA)


[ 
https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862818#comment-16862818
 ] 

Timo Boehme commented on UIMA-6064:
---

Hi, I've checked it and actually it is the other way around: 
DISALLOW_DOCTYPE_DECL is the important one for allowing any DTD declaration 
(internal or external) - otherwise any DOCTYPE declaration will produce an 
error. LOAD_EXTERNAL_DTD 
(http://apache.org/xml/features/nonvalidating/load-external-dtd) is only 
recognized if not validating (if 'true', the parser will parse (but not 'use') 
the DTD and report errors it contains). If validation is set to 'true' (e.g. 
'mSchemaValidationEnabled' is 'true'), the feature 
'http://xml.org/sax/features/validation' is set to 'true', which overrides the 
LOAD_EXTERNAL_DTD setting.

To summarize: a switch/flag for DISALLOW_DOCTYPE_DECL is required.
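
To make the effect of the flag concrete, here is a small Java sketch 
(illustrative only; the class and method names are hypothetical, and it assumes 
a parser that recognizes the Xerces feature, such as the one bundled with the 
JDK). It parses a document with an internal DTD subset once with the feature 
on and once with it off.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class DoctypeFeatureDemo {
  static final String DISALLOW_DOCTYPE_DECL =
      "http://apache.org/xml/features/disallow-doctype-decl";

  // Returns true if the document parses, false if the parser rejects it.
  static boolean parses(String xml, boolean disallowDoctype) {
    SAXParser parser;
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setFeature(DISALLOW_DOCTYPE_DECL, disallowDoctype);
      parser = factory.newSAXParser();
    } catch (ParserConfigurationException | SAXException e) {
      throw new IllegalStateException("Parser could not be configured", e);
    }
    try {
      parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
          new DefaultHandler()); // DefaultHandler rethrows fatal errors
      return true;
    } catch (SAXException e) {
      return false; // e.g. DOCTYPE rejected because the feature is on
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Internal DTD subset only; no external file is referenced.
    String xml = "<!DOCTYPE root [<!ENTITY greeting \"hello\">]><root>&greeting;</root>";
    System.out.println("feature on:  parses = " + parses(xml, true));
    System.out.println("feature off: parses = " + parses(xml, false));
  }
}
```

With the JDK's bundled parser the first call should be rejected and the second 
should succeed, which matches the point above that even an internal DTD needs 
DISALLOW_DOCTYPE_DECL switched off.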



[jira] [Commented] (UIMA-6064) External DTD usage in XML descriptors disabled during build revision upgrade

2019-06-12 Thread Marshall Schor (JIRA)


[ 
https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862384#comment-16862384
 ] 

Marshall Schor commented on UIMA-6064:
--

Hi, would putting the disabling of LOAD_EXTERNAL_DTD (only) under the control 
of a JVM - D uima. kind of argument be suitable for your use case?  Or do 
you also need the other flag?
