I managed to write a test case that causes a similar bug to the one you reported. I’ll keep you updated.
On Thu, Jan 30, 2014 at 6:33 PM, Martin Reckziegel <[email protected]> wrote:
> Hello Christian,
>
> thanks for your answer. I managed to solve the problem using the latest
> snapshot, but there are some issues/notes I want to share.
> First, it seems not possible (neither in 7.7.2 nor in 7.8 beta) to change
> the parser options (at least there were no changes in behaviour).
> I'm running basex using the bin/basexhttp script. If I change the intparse
> or dtd option using bin/basexclient, they are restored to default when
> restarting the server; I'm not sure whether this is desired behaviour or
> not. But even without a restart it's not possible to get the XML files in
> question parsed in 7.7.2.
>
> The second note is that the latest snapshot has some serious
> concurrency issues which 7.7.2 doesn't have.
> I am using a node.js environment to PUT around 10000 xml files to the db.
> If I start those PUT requests all at once (I have no idea how node
> internally queues them or if it fires them all at once on the network), I
> get these exceptions after a few successful PUTs with the latest snapshot:
>
>
> Improper use? Potential bug?
> Your feedback is welcome:
> Contact: [email protected]
> Version: BaseX 7.8 beta 4cfa54c
> Java: Oracle Corporation, 1.7.0_25
> OS: Linux, amd64
> Stack Trace:
> java.lang.RuntimeException: Data Access out of bounds:
> - pre value: 1950001
> - #used blocks: 7618
> - #total locks: 7618
> - access: 7617 (7618 > 7617]
> at org.basex.util.Util.notExpected(Util.java:53)
> at org.basex.io.random.TableDiskAccess.cursor(TableDiskAccess.java:508)
> at org.basex.io.random.TableDiskAccess.read5(TableDiskAccess.java:216)
> at org.basex.data.Data.textOff(Data.java:422)
> at org.basex.data.DiskData.text(DiskData.java:234)
> at org.basex.core.cmd.List.listDB(List.java:132)
> at org.basex.core.cmd.List.run(List.java:50)
> at org.basex.core.Command.run(Command.java:329)
> at org.basex.http.rest.RESTCmd.run(RESTCmd.java:93)
> at org.basex.http.rest.RESTCmd.run(RESTCmd.java:82)
> at org.basex.http.rest.RESTRetrieve.run0(RESTRetrieve.java:51)
> at org.basex.http.rest.RESTCmd.run(RESTCmd.java:61)
> at org.basex.core.Command.run(Command.java:329)
> at org.basex.core.Command.execute(Command.java:94)
> at org.basex.core.Command.execute(Command.java:117)
> at org.basex.http.rest.RESTServlet.run(RESTServlet.java:21)
> at org.basex.http.BaseXServlet.service(BaseXServlet.java:58)
> ....
>
>
> sometimes the collection is not even accessible per GET afterwards (other
> collections are).
> PUTting the xml files one by one and waiting for the last result first
> however works fine.
> 7.7.2 doesn't have this issue, so is this maybe some regression bug?
>
> best,
> Martin
>
>
>
> On 28.01.2014 23:59, Christian Grün wrote:
>
> An update: I noticed that external entity references were resolved by
> the parser even if DTD parsing was switched off, leading to long
> waiting times. The issue is resolved in the very latest snapshot, both
> with the internal and Java’s default parser. If you still want to
> parse all entities, simply activate DTD parsing.
>
>
> On Tue, Jan 28, 2014 at 6:44 PM, Christian Grün <[email protected]> wrote:
>
> Hi Martin,
>
> thanks for your feedback. The problem should be solved with Version
> 7.8 of BaseX. The official version will be out soon, but you are
> invited to check out the latest stable snapshot [1].
>
> If you want to use BaseX 7.7.2, you can also switch to Java’s default
> parser (via SET INTPARSE false, or by deactivating "Use internal XML
> parser" in the "Database" → "New…" dialog and the "Parsing" tab).
>
> Hope this helps,
> Christian
>
> [1] http://files.basex.org/releases/latest/
>
>
> On Tue, Jan 28, 2014 at 6:36 PM, Martin Reckziegel <[email protected]> wrote:
>
> Hello everybody,
>
> I'm using basex 7.7.2 in a university-based project. I'm trying to store TEI
> XML files in the database, but there is an error storing certain valid files.
> Using a REST PUT request to store a file starting like this:
>
> <?xml version="1.0"?>
> <!DOCTYPE TEI.2 PUBLIC "-//TEI P4//DTD Main DTD Driver File//EN" "http://www.tei-c.org/Guidelines/DTD/tei2.dtd" [
> <!ENTITY % TEI.XML "INCLUDE">
> <!ENTITY % PersProse PUBLIC "-//Perseus P4//DTD Perseus Prose//EN" "http://www.perseus.tufts.edu/DTD/1.0/PersProse.dtd">
> %PersProse;
> ]>
> <TEI.2>
> <teiHeader type="text" status="new">
> ....
>
> results in this error:
>
> "tlg0003.xml.xml" (Line 5): ']' expected, '<' found.
>
> (Line 5 is %PersProse;)
> I have no clue how to interpret the error since none of the mentioned
> characters are in that line. Maybe this results from some internal
> replacement?
> Anyway, deleting line 5 resolves the error (but of course does not solve my
> problem since I don't want to alter the files).
> The problematic files are all valid, at least according to
> http://www.validome.org/xml/validate/ and http://validator.w3.org/check, so I
> wonder why they are rejected by basex?
>
> kind regards,
> Martin Reckziegel
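
As an interim workaround for the concurrency problem, the one-by-one approach Martin describes (send a PUT, wait for the response, then send the next) could be sketched roughly as follows. This is only an illustration in Python, not the node.js code from the report; the same pattern applies there. The host, the port, the database name "mydb" and the function names are assumptions, and authentication is left out and would have to be added for a real server.

```python
import urllib.request
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

# Assumption: a local BaseX HTTP server and a hypothetical database "mydb".
BASE_URL = "http://localhost:8984/rest/mydb"


def put_file(url: str, body: bytes) -> int:
    """Send one PUT request and block until the server has answered."""
    request = urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def upload_sequentially(
    paths: Iterable[str],
    put: Callable[[str, bytes], int] = put_file,
    base_url: str = BASE_URL,
) -> List[Tuple[str, int]]:
    """PUT each file in turn; the next request starts only after the
    previous call has returned, so at most one PUT is in flight."""
    results = []
    for raw_path in paths:
        path = Path(raw_path)
        status = put(f"{base_url}/{path.name}", path.read_bytes())
        results.append((path.name, status))
    return results
```

Calling upload_sequentially with the list of TEI file paths returns one (file name, HTTP status) pair per file. The put parameter exists only so that the request function can be swapped out, for example for a dry run without a server.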
_______________________________________________
BaseX-Talk mailing list
[email protected]
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk

