Hi and thanks for the reply,

[reposting it to link it to the correct discussion]

In my first post I talked about using xdmp:document-insert, but it should have 
been xdmp:document-load. Sorry for the confusion.

Below are some working examples of both cases:

--- XXE EXAMPLE ---
File : xxe-example.xml
<?xml version="1.0" encoding="ISO-8859-1"?>  <!DOCTYPE foo [  
   <!ELEMENT foo ANY >
   <!ENTITY xxe SYSTEM "file:///d:/xxe-inject-me.xml" >]><foo>&xxe;</foo>

File : xxe-inject-me.xml
<?xml version="1.0"?>
<lolz>hallo</lolz>

Code :
xdmp:document-load("d:\xxe-example.xml",
    map:map() => map:with("uri", "/test/xxe-result.xml"))

Result:
<?xml  version="1.0" encoding="UTF-8"?>
<foo>
<lolz>hallo
</lolz>
</foo>

So the file contents get inserted just as requested. This is something you 
would want to block or prevent from happening.

Note that xdmp:document-get("d:\xxe-example.xml") produces the same output 
and behaviour.

As the file location embedded in the XML could also point to an external (http) 
location, this is a potential risk when loading XML files. I think this should 
be addressed in both functions by adding something like an 'ignore-dtd' 
option.

Using document-insert in your example reveals something interesting: the 
unquote function appears to ignore or disable DTD processing, at least in this 
case. document-insert itself expects a node, so the XML has already been 
processed somewhere before it is called.

--- XML BOMB ---
File : xmlbomb.xml
<!DOCTYPE lolz [
 <!ENTITY lol "lol">
 <!ELEMENT lolz (#PCDATA)>
 <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
 <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
 <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
 <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
 <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
 <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
 <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
 <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
 <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>

Code:
xdmp:document-load("d:\xmlbomb.xml",
    map:map() => map:with("uri", "/test/xmlbomb.xml"))

Result:
Code is executed and continues running.

Interestingly, when reading the content and using the unquote function, the 
process also keeps loading the file. So unquote does not really ignore all DTD 
definitions, as it appeared to when loading the XXE example.

My conclusion thus far: document-load and document-get are vulnerable to these 
exploits, with no option to turn the behaviour off. document-insert is not 
affected, as it expects a node, at which point the original document has 
already been processed. The unquote function sometimes prevents the exploits 
from executing and sometimes does not.
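
As a possible stop-gap until something like an 'ignore-dtd' option exists, the 
file could be read as text and rejected when it declares a DTD, so the XML 
parser never sees it. The following is only a sketch under assumptions: 
local:load-without-dtd is a hypothetical helper name (not a MarkLogic 
built-in), the file is assumed to be readable as text by 
xdmp:filesystem-file, and the plain string check will also reject harmless 
documents that merely mention DOCTYPE or ENTITY in a comment or CDATA section.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper, not a MarkLogic built-in: refuse raw XML text that
   contains a DOCTYPE or ENTITY declaration before it reaches the parser. :)
declare function local:load-without-dtd(
  $path as xs:string,
  $uri  as xs:string
) as empty-sequence()
{
  (: xdmp:filesystem-file returns the file as text; nothing is parsed yet. :)
  let $text := xdmp:filesystem-file($path)
  return
    if (fn:matches($text, "<!(DOCTYPE|ENTITY)", "i"))
    then fn:error(xs:QName("local:DTD-REJECTED"),
                  fn:concat("DOCTYPE or ENTITY declaration found in ", $path))
    (: Only text without any DTD is parsed and inserted. :)
    else xdmp:document-insert($uri, xdmp:unquote($text))
};

local:load-without-dtd("d:\xmlbomb.xml", "/test/xmlbomb.xml")
```

With the xmlbomb.xml file above, the intent is that the call raises the error 
instead of handing the text to unquote. It is a blunt check and does not 
replace a proper parser-level setting.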

Any thoughts on the matter would be appreciated!

Thanks,
Marcel



Date: Wed, 14 Mar 2018 17:28:34 +0000
From: Keith Breinholt <breinhol...@ldschurch.org>
Subject: Re: [MarkLogic Dev General] Marklogic XXE and XML Bomb prevention
To: MarkLogic Developer Discussion <general@developer.marklogic.com>

Here is the closest I have been able to come to inserting the "document".

xquery version "1.0-ml";
let $doc := xdmp:unquote( xdmp:filesystem-file("C:/xxeInjection.xml") )

return (
    $doc,
    xdmp:document-insert( "/xxeInjection.xml", $doc)
)

The contents of the xxeInjection.xml file are exactly as you specify below.

However, when the file is loaded from the file system it is text and must be 
unquoted ...  xdmp:unquote() strips the invalid HTML DOCTYPE and we get:

<?xml version="1.0" encoding="UTF-8"?>
<foo>;</foo>

Could you please show us the code you used to insert the xxe injection 
"document" unmodified?

-Keith

From: Keith Breinholt
Sent: Wednesday, March 14, 2018 11:07 AM
To: general@developer.marklogic.com
Subject: RE: Marklogic XXE and XML Bomb prevention

Perhaps you could show the code that you used to insert the document into the 
database.

I, personally, cannot get your code to work for a number of reasons.  1) having 
both an xml processing statement and an HTML doctype is invalid.  2) Trying to 
assign the "document" to a variable throws an error because of #1. 3) If I try 
to put the "document" below into a file on the file system and load it I cannot 
use xdmp:document-insert() to insert the "document" into the database because 
there isn't a valid node.

There may be something I have overlooked so please share the code you used to 
insert this document into a database.

-Keith

From: general-boun...@developer.marklogic.com On Behalf Of Marcel de Kleine
Sent: Wednesday, March 14, 2018 6:43 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Marklogic XXE and XML Bomb prevention

Hello,

We have noticed MarkLogic is vulnerable to XXE (XML external entity) and XML 
bomb attacks. When loading a malicious document using xdmp:document-insert, it 
does not catch these, which results in either the loading of unwanted external 
documents (XXE) or a lockup of the system (XML bomb).

For example, if I load this document :
<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [
   <!ELEMENT foo ANY >
   <!ENTITY xxe SYSTEM "file:///c:/text.xml" >]> <foo>&xxe;</foo>

The file test.xml gets nicely added to the xml document.

See OWASP and others for examples.

This is clearly an XML processing issue, so the question is: can we disable 
this? And if so, at what levels would this be possible? System-wide would be 
best. (And if you cannot disable this, I think this is something ML should 
address immediately.)

Thank you in advance,
Marcel de Kleine, EPAM
