Graydon,

That is what I was afraid you were going to say. It is a low limit.
(It is too bad too, since it prevents us from using this parser feature to
determine how deep entity nesting actually goes.)

On the other hand, as you are saying, flat entities can usually also be
normalized away early.

Thanks!
Wendell

On Thu, Jan 8, 2026 at 9:20 AM Graydon Saunders via BaseX-Talk <
[email protected]> wrote:

> Hi Wendell,
>
> It's the total number of entities.
>
> The document where I first had this experience has about five thousand
> instances of the non-breaking-space HTML entity in it; converting those to
> actual non-breaking spaces made the problem go away. No nested entities
> whatsoever.
>
> -- Graydon
>
> On Thu, Jan 8, 2026, at 09:14, Wendell Piez via BaseX-Talk wrote:
>
> Hello Christian,
>
> Please forgive a slightly OT question, for background - is the parser
> limiting the expansion of entities with respect to their total count, or
> only with respect to their nesting (one entity invoking another), i.e.
> invocation depth? (So I can have more than 2500 entities, just not 2500
> deep.)
>
> I kind of assumed it was the invocation depth, am I wrong? Or do parsers
> have settings for both?
>
> (With thanks to you, Graydon and the list for the public conversation.)
>
> Regards, Wendell
>
> On Thu, Jan 8, 2026 at 4:13 AM Christian Grün via BaseX-Talk <
> [email protected]> wrote:
>
> Hi Graydon,
>
> You are right, Java imposes various limits on the XML parser that get
> stricter and more fine-grained with every version of the language [1].
>
> Currently, there are two ways to tackle this:
>
> • The properties can be overridden when starting BaseX on the command
> line, for example:
>
> -Djdk.xml.maxGeneralEntitySizeLimit=0 -Djdk.xml.totalEntitySizeLimit=0
>
> The properties can be added to the BaseX start scripts or assigned to the
> BASEX_JVM environment variable before starting BaseX.
>
> • You can use our internal BaseX XML parser, either by enabling the
> INTPARSE option, or by switching to the »Parsing« tab in the »Create
> Database« dialog of the GUI and activating the corresponding checkbox.
>
> In a future version of BaseX, we may introduce a global option to
> invalidate the limits. As BaseX is a tool for XML experts, we could also
> invalidate the Java options by default. Feedback from everyone is welcome.
>
> Best,
> Christian
>
> [1]
> https://docs.oracle.com/en/java/javase/25/docs/api/java.xml/module-summary.html#Properties
>
> ------------------------------
>
> *From:* Graydon Saunders via BaseX-Talk <[email protected]>
> *Sent:* Tuesday, 6 January 2026 18:11
> *To:* BaseX <[email protected]>
> *Subject:* [basex-talk] the right way to respond to JAXP00010001
>
> Hello (and Happy New Year!)
>
> I'm on Linux (Fedora) using BaseX 12.1 and OpenJDK Runtime Environment
> (Red_Hat-25.0.1.0.8-3) (build 25.0.1+8)
>
> I've got some data, which I want BaseX to load by the individual document
> using the doc() function.
>
> On one of these documents I get a parsing failure that reports
> JAXP00010001; if I look that up, I find
> https://www.oracle.com/java/technologies/javase/24-relnote-issues.html
> which says this limit changed (from a traditional larger number) to 2500,
> so now the error text is
>
> JAXP00010001: The parser has encountered more than "2500" entity expansions
> in this document; this is the limit imposed by the JDK
>
> It's a large file and I can't do anything about that part, nor can I do
> anything about the number of entity references these files happen to have
> when I get them. (In this particular case, a bit more than five thousand.)
> The Oracle page lists a bunch of options for how to set a different entity
> expansion limit.
>
> In the context of BaseX, what's the right way to adjust the entity
> expansion limit (in my case, generally the BaseX GUI) so these files will
> parse?
>
> Thanks!
> Graydon
>
> --
> ...Wendell Piez... ...wendellpiez.com...
> ...pellucidliterature.org... ...pausepress.org...
> ...github.com/wendellpiez...

--
...Wendell Piez... ...wendellpiez.com...
...pellucidliterature.org... ...pausepress.org...
...github.com/wendellpiez...
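
For anyone who wants to try the workaround Graydon describes (converting named HTML entities before the file reaches the parser), here is a minimal Python sketch. It is not something BaseX provides. The function name `normalize_entities` is hypothetical, and the sketch assumes the entity names carry their standard HTML meaning: it uses Python's `html.entities.name2codepoint` table rather than whatever the document's own DTD declares. It also works on raw text, so it would rewrite references inside CDATA sections and comments too. The output uses numeric character references such as `&#160;`, which an XML parser resolves directly rather than through entity expansion, so they should not count toward the JDK limit.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself; these are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches a named reference such as &nbsp; (ASCII names only, an assumption
# made to keep the sketch simple).
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def normalize_entities(text):
    """Replace known HTML named entity references with numeric character
    references. Returns (new_text, number_of_replacements).

    Hypothetical helper: assumes HTML meanings for the names and does not
    consult the document's DTD.
    """
    count = 0

    def substitute(match):
        nonlocal count
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Predefined or unknown name: keep the reference as written.
            return match.group(0)
        count += 1
        return "&#{};".format(name2codepoint[name])

    return ENTITY_REF.sub(substitute, text), count


if __name__ == "__main__":
    sample = "<p>a&nbsp;b &amp; caf&eacute;</p>"
    converted, replaced = normalize_entities(sample)
    print(converted)
    print(replaced, "references rewritten")
```

The returned count shows how many named references were rewritten, which gives a rough idea of how far a file is from the 2500 limit. Names that are neither predefined by XML nor in the HTML table are left alone, so entities declared only in a custom DTD would still need the JVM properties or the internal parser that Christian mentions.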

