I just fixed two bugs in osis2mod. The first was that it did not expect the empty-element tag <title/> and treated it as an open tag. Only at the end of the parent element did it notice that there was no corresponding end tag.
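The <title/> case can be sketched roughly as follows. This is only an illustration in Python, not osis2mod's actual code (which is C++); the names classify_tag and check_nesting are hypothetical, and the input is assumed to be an already-split list of tag strings.

```python
import re

def classify_tag(tag: str) -> str:
    """Classify a tag string as 'end', 'empty', or 'start' (hypothetical helper)."""
    if tag.startswith("</"):
        return "end"
    if tag.endswith("/>"):
        # Empty-element tag such as <title/>: it opens and closes in one step.
        return "empty"
    return "start"

def check_nesting(tags):
    """Return a list of error messages for mismatched or unclosed tags."""
    stack = []
    errors = []
    for tag in tags:
        kind = classify_tag(tag)
        # Element name: the first run of characters after '<' or '</'.
        name = re.match(r"</?\s*([^\s/>]+)", tag).group(1)
        if kind == "start":
            stack.append(name)
        elif kind == "end":
            if not stack or stack[-1] != name:
                errors.append(f"unexpected </{name}>")
            else:
                stack.pop()
        # kind == "empty": nothing is pushed, so no end tag is expected later.
    errors.extend(f"unclosed <{name}>" for name in stack)
    return errors
```

With this, check_nesting(['<div>', '<title/>', '</div>']) reports no errors, whereas treating <title/> as a start tag would leave an unclosed element that is only noticed when the parent's end tag arrives.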
The second was that osis2mod did not allow for > in text, where it is a valid character and does not need to be escaped. For example, in parsing the following:

    <verse osisID="Deut.32.27">sed arrogantiam inimicorum timui, ne superbirent hostes eorum et dicerent: >Manus nostra excelsa, et non Dominus fecit haec omnia!'.</verse>

it found three tags:

    <verse osisID="Deut.32.27">
    <verse osisID="Deut.32.27">sed arrogantiam inimicorum timui, ne superbirent hostes eorum et dicerent: >
    </verse>

The second of these did not have a closing tag! So osis2mod found the error much too late (i.e. at the end of the parent element).

Using a 3rd-party parser (validating or otherwise) would avoid these problems. Who knows how many more parsing bugs are present?

In Him,
DM

Chris Little wrote:
> DM and I have been chatting a bit off-list about the future/function of
> osis2mod and I thought maybe we should open up the discussion a bit.
>
> Right now osis2mod (the tool for converting OSIS Bibles to Sword Bible
> modules) does some mediocre validity checking as it builds its Sword
> database. We'll never really get it perfect this way since we aren't
> doing real schema validation.
>
> DM has suggested adding a real validating parser to osis2mod (by
> embedding something like xerces or libxml), so it could spit out an
> error message if you try to import invalid OSIS.
>
> I'm not totally convinced we should do that. When I prepare modules from
> OSIS docs, I always perform validation in an external validator.
> (Personally I use Oxygen, but there are also XML Spy, MSV, topologi,
> Xerces, etc.)
>
> Do people feel that incorporating a real validator would make osis2mod
> easier to use?
>
> It could potentially cause the filesize to jump dramatically, so would
> that be acceptable?
>
> If we incorporate osis2mod into either front-ends or installmgr so that
> users could import OSIS documents directly into Sword, would that
> support or detract from the case for embedding a full validator?
>
> --Chris

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page