Hi Wendell,

It's the total number of entities.

The document where I first had this experience has about five thousand 
instances of the non-breaking-space HTML entity in it; converting those to 
actual non-breaking spaces made the problem go away. No nested entities 
whatsoever.
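
For anyone who wants to check this for themselves, here is a minimal Java sketch, not anything taken from BaseX. It assumes the JDK's built-in parser is the one `DocumentBuilderFactory.newInstance()` returns, and that the count in the JAXP00010001 message is governed by the `jdk.xml.entityExpansionLimit` system property, which is what the wording of the message suggests. The class name, the test document, and the limit of 100 are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

public class EntityLimitDemo {

    // Builds a document with `count` flat references to one internal entity.
    // The entity's replacement text contains no further entity references,
    // so there is no nesting at all.
    static String buildDoc(int count) {
        StringBuilder sb = new StringBuilder();
        sb.append("<!DOCTYPE doc [<!ENTITY nbsp \"&#160;\">]><doc>");
        for (int i = 0; i < count; i++) {
            sb.append("x&nbsp;");
        }
        return sb.append("</doc>").toString();
    }

    // Returns true if the parser accepts the document under the given limit.
    static boolean parses(String xml, int limit) {
        // Same effect as starting the JVM with -Djdk.xml.entityExpansionLimit=<limit>;
        // the property is set before the factory is created so that it is picked up.
        System.setProperty("jdk.xml.entityExpansionLimit", Integer.toString(limit));
        try {
            DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Parse failures, including an exceeded limit, end up here.
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("10 references, limit 100:   " + parses(buildDoc(10), 100));
        System.out.println("1000 references, limit 100: " + parses(buildDoc(1000), 100));
        // The JDK documentation describes a value of 0 or less as "no limit".
        System.out.println("1000 references, limit 0:   " + parses(buildDoc(1000), 0));
    }
}
```

If those assumptions hold, the second document should be rejected even though none of its references is nested, which matches what I saw with the five thousand non-breaking spaces.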

-- Graydon

On Thu, Jan 8, 2026, at 09:14, Wendell Piez via BaseX-Talk wrote:
> Hello Christian,
> 
> Please forgive a slightly OT question, for background - is the parser 
> limiting the expansion of entities with respect to their total count, or only 
> with respect to their nesting (one entity invoking another) i.e. invocation 
> depth? (So I can have more than 2500 entities, just not 2500 deep.)
> 
> I kind of assumed it was the invocation depth, am I wrong? Or do parsers have 
> settings for both?
> 
> (With thanks to you, Graydon and the list for the public conversation.)
> 
> Regards, Wendell
> 
> On Thu, Jan 8, 2026 at 4:13 AM Christian Grün via BaseX-Talk 
> <[email protected]> wrote:
>> Hi Graydon,
>> 
>> You are right, Java imposes various limits on the XML parser that get 
>> stricter and more fine-grained with every version of the language [1].
>> 
>> Currently, there are two ways to tackle this:
>> 
>> • The properties can be overridden when starting BaseX on the command line, 
>> for example:
>> 
>>   -Djdk.xml.maxGeneralEntitySizeLimit=0 -Djdk.xml.totalEntitySizeLimit=0 
>> 
>> The properties can be added to the BaseX start scripts or assigned to the 
>> BASEX_JVM environment variable before starting BaseX.
>> 
>> • You can use our internal BaseX XML parser, either by enabling the INTPARSE 
>> option, or by switching to the »Parsing« tab in the »Create Database« dialog 
>> of the GUI and activating the corresponding checkbox.
>> 
>> In a future version of BaseX, we may introduce a global option to lift 
>> the limits. As BaseX is a tool for XML experts, we could also disable the 
>> Java limits by default. Feedback from everyone is welcome.
>> 
>> Best,
>> Christian
>> 
>> [1] 
>> https://docs.oracle.com/en/java/javase/25/docs/api/java.xml/module-summary.html#Properties
>> 
>> 
>> *From:* Graydon Saunders via BaseX-Talk <[email protected]>
>> *Sent:* Tuesday, January 6, 2026 18:11
>> *To:* BaseX <[email protected]>
>> *Subject:* [basex-talk] the right way to respond to JAXP00010001
>>  
>> Hello (and Happy New Year!)
>> 
>> I'm on Linux (Fedora) using BaseX 12.1 and OpenJDK Runtime Environment 
>> (Red_Hat-25.0.1.0.8-3) (build 25.0.1+8)
>> 
>> I've got some data, which I want BaseX to load by the individual document 
>> using the doc() function.
>> 
>> On one of these documents I get a parsing failure that reports JAXP00010001; 
>> if I look that up, I find 
>> https://www.oracle.com/java/technologies/javase/24-relnote-issues.html which 
>> says this limit changed (from a traditional larger number) to 2500, so now 
>> the error text is
>> `JAXP00010001: The parser has encountered more than "2500" entity expansions 
>> in this document; this is the limit imposed by the JDK`
>> 
>> It's a large file and I can't do anything about that part, nor can I do 
>> anything about the number of entity references these files happen to have 
>> when I get them. (In this particular case, a bit more than five thousand.) 
>> The Oracle page lists a bunch of options for how to set a different entity 
>> expansion limit.
>> 
>> In the context of BaseX (in my case, generally the BaseX GUI), what's the 
>> right way to adjust the entity expansion limit so these files will parse?
>> 
>> Thanks!
>> Graydon
> 
> 
> --
> ...Wendell Piez... ...wendellpiez.com...
> ...pellucidliterature.org... ...pausepress.org... ...github.com/wendellpiez...
