I managed to write a test case that triggers a bug similar to the one you
reported. I’ll keep you updated.
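
In the meantime, uploading the files one at a time, as you describe below,
avoids the problem. A minimal sketch of that sequential upload follows, in
Python rather than node.js to keep it short. The function names, the REST base
URL (http://localhost:8984/rest/<database>) and the missing authentication are
assumptions for illustration, not taken from your setup:

```python
import urllib.request
from typing import Callable, Iterable, List, Tuple


def put_document(url: str, body: bytes) -> int:
    """Send one XML document with HTTP PUT and return the status code.

    Assumption: no credentials are sent; a real server will usually need them.
    """
    request = urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def put_sequentially(
    base_url: str,
    documents: Iterable[Tuple[str, bytes]],
    put: Callable[[str, bytes], int] = put_document,
) -> List[Tuple[str, int]]:
    """PUT each (name, body) pair in turn.

    The next request starts only after the previous call has returned,
    so the server never sees two of these uploads at the same time.
    """
    results = []
    for name, body in documents:
        status = put(f"{base_url.rstrip('/')}/{name}", body)
        results.append((name, status))
    return results
```

The sender is passed in as a parameter so that the loop can be tried without a
running server, for example with a stub that only records the URLs.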


On Thu, Jan 30, 2014 at 6:33 PM, Martin Reckziegel <
[email protected]> wrote:

>  Hello Christian,
>
> thanks for your answer. I managed to solve the problem using the latest
> snapshot, but there are some issues/notes I want to share.
> First, it seems impossible (in both 7.7.2 and 7.8 beta) to change the
> parser options (at least there were no changes in behaviour).
> I'm running BaseX using the bin/basexhttp script. If I change the intparse
> or dtd option using bin/basexclient, they are restored to their defaults when
> the server is restarted. I'm not sure whether this is desired behaviour or
> not. But even without a restart, it's not possible to get the XML files in
> question parsed in 7.7.2.
>
> The second note is that the latest snapshot has some serious
> concurrency issues which 7.7.2 doesn't have.
> I am using a node.js environment to PUT around 10000 XML files to the db.
> If I start those PUT requests all at once (I have no idea how node
> queues them internally or whether it fires them all at once on the network),
> I get these exceptions after a few successful PUTs with the latest snapshot:
>
>
> Improper use? Potential bug? Your feedback is welcome:
> Contact: [email protected]
> Version: BaseX 7.8 beta 4cfa54c
> Java: Oracle Corporation, 1.7.0_25
> OS: Linux, amd64
> Stack Trace:
> java.lang.RuntimeException: Data Access out of bounds:
> - pre value: 1950001
> - #used blocks: 7618
> - #total locks: 7618
> - access: 7617 (7618 > 7617]
>       at org.basex.util.Util.notExpected(Util.java:53)
>       at org.basex.io.random.TableDiskAccess.cursor(TableDiskAccess.java:508)
>       at org.basex.io.random.TableDiskAccess.read5(TableDiskAccess.java:216)
>       at org.basex.data.Data.textOff(Data.java:422)
>       at org.basex.data.DiskData.text(DiskData.java:234)
>       at org.basex.core.cmd.List.listDB(List.java:132)
>       at org.basex.core.cmd.List.run(List.java:50)
>       at org.basex.core.Command.run(Command.java:329)
>       at org.basex.http.rest.RESTCmd.run(RESTCmd.java:93)
>       at org.basex.http.rest.RESTCmd.run(RESTCmd.java:82)
>       at org.basex.http.rest.RESTRetrieve.run0(RESTRetrieve.java:51)
>       at org.basex.http.rest.RESTCmd.run(RESTCmd.java:61)
>       at org.basex.core.Command.run(Command.java:329)
>       at org.basex.core.Command.execute(Command.java:94)
>       at org.basex.core.Command.execute(Command.java:117)
>       at org.basex.http.rest.RESTServlet.run(RESTServlet.java:21)
>       at org.basex.http.BaseXServlet.service(BaseXServlet.java:58)
>       ....
>
>
> Sometimes the collection is not even accessible via GET afterwards (other
> collections are).
> PUTting the XML files one by one, waiting for each result before sending
> the next, works fine, however.
> 7.7.2 doesn't have this issue, so is this maybe a regression?
>
> best,
> Martin
>
>
>
> On 28.01.2014 23:59, Christian Grün wrote:
>
> An update: I noticed that external entity references were resolved by
> the parser even if DTD parsing was switched off, leading to long
> waiting times. The issue is resolved in the very latest snapshot, both
> with the internal and Java’s default parser. If you still want to
> parse all entities, simply activate DTD parsing.
>
>
> On Tue, Jan 28, 2014 at 6:44 PM, Christian Grün <[email protected]> wrote:
>
>  Hi Martin,
>
> thanks for your feedback. The problem should be solved with Version
> 7.8 of BaseX. The official version will be out soon, but you are
> invited to check out the latest stable snapshot [1].
>
> If you want to use BaseX 7.7.2, you can also switch to Java’s default
> parser (via SET INTPARSE false, or by deactivating "Use internal XML
> parser" in the "Database" → "New…" dialog and the "Parsing" tab).
>
> Hope this helps,
> Christian
>
> [1] http://files.basex.org/releases/latest/
>
>
> On Tue, Jan 28, 2014 at 6:36 PM, Martin Reckziegel <[email protected]> wrote:
>
>  Hello everybody,
>
> I'm using BaseX 7.7.2 in a university-based project. I'm trying to store TEI
> XML files in the database, but there is an error storing certain valid files.
> Using a REST PUT request to store a file starting like this:
>
> <?xml version="1.0"?>
> <!DOCTYPE TEI.2 PUBLIC "-//TEI P4//DTD Main DTD Driver File//EN" "http://www.tei-c.org/Guidelines/DTD/tei2.dtd" [
> <!ENTITY % TEI.XML "INCLUDE">
> <!ENTITY % PersProse PUBLIC "-//Perseus P4//DTD Perseus Prose//EN" "http://www.perseus.tufts.edu/DTD/1.0/PersProse.dtd">
> %PersProse;
> ]>
> <TEI.2>
> <teiHeader type="text" status="new">
> ....
>
> results in this error:
>
>  "tlg0003.xml.xml" (Line 5): ']' expected, '<' found.
>
>  (Line 5 is %PersProse;)
> I have no clue how to interpret the error, since none of the mentioned
> characters are in that line. Maybe this results from some internal
> replacement?
> Anyway, deleting line 5 resolves the error (but of course does not solve my
> problem, since I don't want to alter the files).
> The problematic files are all valid, at least according to
> http://www.validome.org/xml/validate/ and http://validator.w3.org/check, so I
> wonder why they are rejected by BaseX?
>
> kind regards,
> Martin Reckziegel
>
_______________________________________________
BaseX-Talk mailing list
[email protected]
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
