Actually, I never sent an earlier mail, so my “earlier mail” statement is wrong (that is, I was wrong about being wrong here).

Anyway, the original sample doc was (is) valid, and the injection can be done if you have access to the ML server’s file system, ML has read access to a directory you can write to, and you can create and run XQuery that loads the file from the server’s file system. But except by disallowing entity reference resolution, I don’t see how this is something ML itself can prevent. I think it’s an application and server security issue.

I tried the same test but using an HTTP URL that is resolvable:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "https://github.com/dita-community/dita-ot-project-docker/raw/master/README.md" >]>
<foo>&xxe;</foo>

I verified that the URL is resolvable by using Oxygen’s open URL function to get the file. Using xdmp:load(), ML reports:

[1.0-ml] SVC-FILOPN: xdmp:eval("xquery version "1.0-ml"; let $source := xdmp:...", (), <options xmlns="xdmp:eval"><database>13776655024510127060</database>...</options>) -- File open error: open 'https://github.com/dita-community/dita-ot-project-docker/raw/master/README.md': No such file or directory

So injection of HTTP URLs does not appear to work in this case.

So I think the injection case can only happen from ML-executed XQuery when reading files from the server’s file system. But in a normal server, file system access should be tightly controlled. Documents loaded using facilities outside ML, e.g., Java code that parses source XML to pass to XCC or something, are outside of ML’s ability to control and are thus an application issue, not an ML issue.

It does not appear to be possible to do this injection using documents supplied via, say, an HTTP response: even if you could provide an XML document with a DOCTYPE declaration, ML would not be able to resolve any entity references that were not to files in the local ML database.

Unquote did not handle the entity reference. It clearly tried to resolve it before doing the unquoting, resulting in an “illegal entity reference” message, and escaping the “&” resulted in a single “;” where the entity reference was (meaning “&amp;xxe;” was not converted to “&xxe;” and then resolved; I’m not actually sure what the unquote processor is doing in that case).

Cheers,

E.
--
Eliot Kimber
http://contrext.com

From: <general-boun...@developer.marklogic.com> on behalf of Eliot Kimber <ekim...@contrext.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Wednesday, March 14, 2018 at 2:49 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Marklogic XXE and XML Bomb prevention

My earlier mail was wrong. I was able to replicate the behavior in ML 9 using Query Console.

Here is my source doc:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "/ekimber/test/test.xml" >]>
<foo>&xxe;</foo>

And test.xml:

---
This is a text file injected.
---

Here’s my query:

xquery version "1.0-ml";
let $source := xdmp:load("/ekimber/test/injection-test.xml")
let $result := doc("/ekimber/test/injection-test.xml")
return $result

Result:

<?xml version="1.0" encoding="UTF-8"?>
<foo>This is a text file injected. </foo>

So ML’s built-in parser will resolve entity references to files on the file system when a document is loaded from the file system using xdmp:load(). But I’m not sure this is something ML can or should prevent. I think the security presumption is that if you have access to the server (the machine running ML) and rights to run XQuery on ML, you can do anything. I think it is up to a MarkLogic application to impose rules on what documents are allowed to be loaded and what the constraints on entity resolution are.

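As a rough illustration of that kind of application-level rule, here is a minimal Python sketch (Python rather than XQuery, purely for illustration) of a check that refuses any document carrying a DOCTYPE declaration before it is handed to ML. The names reject_doctype and DoctypeForbidden are made up for this example and are not part of any MarkLogic API, and it assumes the application gets to see the raw bytes before loading them.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration (hypothetical name)."""


def reject_doctype(xml_bytes: bytes) -> None:
    """Parse xml_bytes and raise DoctypeForbidden if a DOCTYPE is present.

    Returns None for a well-formed document without a DOCTYPE. Malformed
    input raises xml.parsers.expat.ExpatError as usual.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, sysid, pubid, has_internal_subset):
        # Stop at the DOCTYPE itself, before any entity declaration is read.
        raise DoctypeForbidden(f"DOCTYPE declaration not allowed (root: {name})")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0"?>'
        b'<!DOCTYPE foo [ <!ELEMENT foo ANY >'
        b' <!ENTITY xxe SYSTEM "/ekimber/test/test.xml" >]>'
        b"<foo>&xxe;</foo>"
    )
    try:
        reject_doctype(sample)
    except DoctypeForbidden as err:
        print("rejected:", err)
```

Refusing the DOCTYPE outright rules out external entities as well as internally declared entities of the XML-bomb kind, at the cost of also refusing legitimate DTD-based documents, so a real application might need a more selective rule.
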
Cheers,

Eliot
--
Eliot Kimber
http://contrext.com

From: <general-boun...@developer.marklogic.com> on behalf of Keith Breinholt <breinhol...@ldschurch.org>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Wednesday, March 14, 2018 at 12:07 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Marklogic XXE and XML Bomb prevention

Perhaps you could show the code that you used to insert the document into the database. I, personally, cannot get your code to work for a number of reasons.

1) Having both an XML processing statement and an HTML doctype is invalid.
2) Trying to assign the “document” to a variable throws an error because of #1.
3) If I try to put the “document” below into a file on the file system and load it, I cannot use xdmp:document-insert() to insert the “document” into the database because there isn’t a valid node.

There may be something I have overlooked, so please share the code you used to insert this document into a database.

-Keith

From: general-boun...@developer.marklogic.com <general-boun...@developer.marklogic.com> On Behalf Of Marcel de Kleine
Sent: Wednesday, March 14, 2018 6:43 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Marklogic XXE and XML Bomb prevention

Hello,

We have noticed MarkLogic is vulnerable to XXE (entity expansion) and XML bomb attacks. When loading a malicious document using xdmp:document-insert, it won’t catch these, which causes either loading of unwanted external documents (XXE) or lockup of the system (XML bomb).

For example, if I load this document:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///c:/text.xml" >]>
<foo>&xxe;</foo>

The file test.xml gets nicely added to the XML document. See OWASP and others for examples.

This is clearly an XML processing issue, so the question is: can we disable this? And if so, on what levels would this be possible?
Best should be system-wide. And if you cannot disable this, I think this is something ML should address immediately.

Thank you in advance,
Marcel de Kleine, EPAM

Marcel de Kleine
Senior Software Engineer
Office: +31 20 241 6134 x 30530
Cell: +31 6 14806016
Email: marcel_de_kle...@epam.com
Delft, Netherlands
epam.com