Thanks. I changed the value to 1 but that made no difference -- the exploit still works.
I can see that the JenaXMLInput class is related -- but I'm not sure
how it applies to my code if I'm not using XMLReader directly?
https://github.com/apache/jena/blob/main/jena-core/src/main/java/org/apache/jena/util/JenaXMLInput.java

Would a test case using RDFParser help?

On Sun, Nov 16, 2025 at 6:40 PM Andy Seaborne <[email protected]> wrote:
>
>
> > System.setProperty("jdk.xml.entityExpansionLimit", "0");
>
> https://docs.oracle.com/javase/tutorial/jaxp/limits/limits.html
> "A value less than or equal to 0 indicates no limit."
>
>     Andy
>
> On 16/11/2025 16:44, Martynas Jusevičius wrote:
> > Hi,
> >
> > I want to protect my RDF/XML I/O code against Billion laughs, external
> > DTD and similar exploits. Using Jena 4.7.0.
> >
> > The reader code looks like this:
> >
> > public Model read(Model model, InputStream is, Lang lang, String baseURI, ErrorHandler errorHandler)
> > {
> >     RDFParser parser = RDFParser.create().
> >         lang(lang).
> >         errorHandler(errorHandler).
> >         checking(true). // otherwise exceptions will not be thrown for invalid URIs!
> >         base(baseURI).
> >         source(is).
> >         build();
> >
> >     parser.parse(StreamRDFLib.graph(model.getGraph()));
> >
> >     return model;
> > }
> >
> > I have a script that submits RDF/XML with Billion laughs (recursive
> > entity expansion) and that causes the Java application to run out of
> > memory:
> >
> > Caused by: java.lang.OutOfMemoryError: Java heap space
> >     at java.base/java.util.Arrays.copyOf(Arrays.java:3537)
> >     at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:228)
> >     at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:740)
> >     at java.base/java.lang.StringBuffer.append(StringBuffer.java:410)
> >     at org.apache.jena.rdfxml.xmlinput.states.AbsWantLiteralValueOrDescription.characters(AbsWantLiteralValueOrDescription.java:62)
> >     at org.apache.jena.rdfxml.xmlinput.states.WantLiteralValueOrDescription.characters(WantLiteralValueOrDescription.java:77)
> >     at org.apache.jena.rdfxml.xmlinput.impl.XMLHandler.characters(XMLHandler.java:137)
> >     at org.apache.xerces.parsers.AbstractSAXParser.characters(Unknown Source)
> >     at org.apache.xerces.impl.dtd.XMLDTDValidator.characters(Unknown Source)
> >     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
> >     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
> >     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
> >     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> >     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> >     at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
> >     at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
> >     at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
> >     at org.apache.jena.rdfxml.xmlinput.impl.RDFXMLParser.parse(RDFXMLParser.java:96)
> >     at org.apache.jena.rdfxml.xmlinput.ARP.load(ARP.java:118)
> >     at org.apache.jena.riot.lang.ReaderRIOTRDFXML.parse(ReaderRIOTRDFXML.java:186)
> >     at org.apache.jena.riot.lang.ReaderRIOTRDFXML.read(ReaderRIOTRDFXML.java:84)
> >     at org.apache.jena.riot.RDFParser.read(RDFParser.java:416)
> >     at org.apache.jena.riot.RDFParser.parseNotUri(RDFParser.java:406)
> >     at org.apache.jena.riot.RDFParser.parse(RDFParser.java:356)
> >     at com.atomgraph.core.io.ModelProvider.read(ModelProvider.java:113)
> >     at com.atomgraph.core.io.ModelProvider.read(ModelProvider.java:96)
> >     at com.atomgraph.core.io.ModelProvider.readFrom(ModelProvider.java:90)
> >     at com.atomgraph.core.io.ModelProvider.readFrom(ModelProvider.java:53)
> >     at org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$TerminalReaderInterceptor.invokeReadFrom(ReaderInterceptorExecutor.java:233)
> >     at org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$TerminalReaderInterceptor.aroundReadFrom(ReaderInterceptorExecutor.java:212)
> >     at org.glassfish.jersey.message.internal.ReaderInterceptorExecutor.proceed(ReaderInterceptorExecutor.java:132)
> >     at org.glassfish.jersey.message.internal.MessageBodyFactory.readFrom(MessageBodyFactory.java:1072)
> >
> > I have tried the following config (and its alternative in
> > CATALINA_OPTS), but they do not seem to have any effect -- the exploit
> > still works:
> >
> > System.setProperty("javax.xml.stream.isSupportingExternalEntities", "false");
> > System.setProperty("javax.xml.accessExternalDTD", "");
> > System.setProperty("javax.xml.accessExternalSchema", "");
> > System.setProperty("jdk.xml.entityExpansionLimit", "0");
> >
> > What is the solution here?
> > I would hate to have to single out RDF/XML and handle it specially,
> > but I'll do it if it's necessary in order to solve this.
> >
> > Thanks.
> >
> > Martynas
> > atomgraph.com
>
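
For reference, a minimal sketch of the "handle RDF/XML specially" fallback mentioned above, using only JAXP and no Jena classes. It assumes that rejecting every document with a DOCTYPE is acceptable; this is a blunt rule, since it also refuses RDF/XML that uses internal entities only as namespace shorthand. The class name DoctypeGuard, the method names and the sample payloads are hypothetical. The feature id is the documented Xerces "disallow-doctype-decl" feature. Note that the stack trace shows frames under org.apache.xerces.* rather than the JDK-internal parser packages, which may be why the jdk.xml.* system properties appear to have no effect; that is an inference from the trace, not something confirmed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: pre-screens XML before it is handed to any RDF/XML reader.
class DoctypeGuard {
    // Xerces feature id; with it enabled, any DOCTYPE declaration is a fatal error.
    static final String DISALLOW_DOCTYPE =
        "http://apache.org/xml/features/disallow-doctype-decl";

    // Sample payload with an internal entity (a harmless stand-in for Billion laughs).
    static final String WITH_DOCTYPE =
        "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE rdf:RDF [ <!ENTITY a \"x\"> ]>"
        + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"/>";

    // Sample payload without any DOCTYPE.
    static final String PLAIN =
        "<?xml version=\"1.0\"?>"
        + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"/>";

    // Builds a SAX parser with secure processing on and DOCTYPE declarations refused.
    static SAXParser newHardenedParser() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature(DISALLOW_DOCTYPE, true);
            return factory.newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("XML parser does not support the requested hardening", e);
        }
    }

    // Returns true if the document parses under the hardened settings, false if rejected.
    static boolean isAccepted(String xml) {
        SAXParser parser = newHardenedParser();
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            // DefaultHandler rethrows fatal errors, so a DOCTYPE surfaces as a SAXException.
            parser.parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with DOCTYPE accepted: " + isAccepted(WITH_DOCTYPE));
        System.out.println("plain accepted: " + isAccepted(PLAIN));
    }
}
```

A pre-check like this parses the input twice, so it costs an extra pass and needs the request body buffered. Whether the same feature can be set on the parser that RDFParser uses internally in Jena 4.7.0 is not established here.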
