The upshot is that using the index (the start and length of a verse) to extract that verse may yield a fragment that is not well-formed XML.
Maybe I am missing something, but I see only a few solutions:
1) change the meaning of the index so that it results in well-formed XML. This well-formed XML will contain the verse or be the verse. Software will have to adjust for this.
2) re-encode the encoded Bible so that no element is ever split by a verse boundary; any element that would be split is transformed into marker elements instead.
3) do a prepass over the fragment looking for begin or end tags that lack their counterpart, and artificially add the missing tag.
4) instead of artificially adding the missing tags, progressively widen (blur) the passage until it is well-formed.
5) strip the unmatched tags out. (I think that the code does this as part of a multi-step recovery mechanism.)
6) get the verse out of the book as a whole and then find the nearest ancestor element that fully contains the verse. (can't reliably do chapters since paragraphs may be split across chapter boundaries.)
7) Sidestep it (for the most part) by presenting a verse in the context of the entire chapter. (I say "for the most part" because a paragraph can cross chapter boundaries.) But if we have the code to deal with chapters, then we will have the code to deal with verses.
... any other ...
The first two are out of JSword's control. The others we can do in JSword.
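To make option 3 concrete, here is a minimal sketch of such a prepass in Java. The names FragmentBalancer and balance are hypothetical; this is not existing JSword code. It assumes the fragment was cut from a well-formed document, it does not handle comments, CDATA sections, or processing instructions, and the start tags it invents carry no attributes, so they are placeholders only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class FragmentBalancer {
    // Matches start tags, end tags, and empty-element tags.
    // Simplification: comments, CDATA, and PIs are not recognized.
    private static final Pattern TAG =
        Pattern.compile("<(/?)([A-Za-z_][\\w:.-]*)([^<>]*?)(/?)>");

    static String balance(String fragment) {
        Deque<String> open = new ArrayDeque<>();        // start tags still waiting for an end tag
        List<String> missingStarts = new ArrayList<>(); // end tags seen with no start tag before them

        Matcher m = TAG.matcher(fragment);
        while (m.find()) {
            boolean isEnd = !m.group(1).isEmpty();
            boolean isEmpty = !m.group(4).isEmpty();
            String name = m.group(2);
            if (isEmpty) {
                continue; // empty-element tags (e.g. markers) need no partner
            }
            if (!isEnd) {
                open.push(name);
            } else if (!open.isEmpty() && open.peek().equals(name)) {
                open.pop();
            } else {
                // Assumes a well-formed source, so this end tag's start
                // lies before the beginning of the fragment.
                missingStarts.add(name);
            }
        }

        StringBuilder out = new StringBuilder();
        // The last unmatched end tag belongs to the outermost element,
        // so the invented start tags are emitted in reverse order.
        for (int i = missingStarts.size() - 1; i >= 0; i--) {
            out.append('<').append(missingStarts.get(i)).append('>');
        }
        out.append(fragment);
        // Close whatever is still open, innermost element first.
        for (String name : open) {
            out.append("</").append(name).append('>');
        }
        return out.toString();
    }
}
```

Option 5 could presumably reuse the same scan, deleting the unmatched tags instead of inventing partners for them.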
Any thoughts/response?
Assuming that the fragment is well-formed XML, there are two related problems:
What should we do when presenting a fragment that has begin markers for non-verse elements but not the corresponding end markers?
Likewise, what should we do with verses that contain an end marker but not the corresponding begin marker?
These don't prevent the verse from being well formed, but they do prevent it from being fully meaningful.
I will be talking to a friend who is an expert in Legal XML, which has a similar mechanism, to find out what that community feels is the best practice. He has mentioned that it is a best practice to declare a primary container model and to have everything that can cross those boundaries use marker elements. The primary container model differs from one work to another; for a given work it may be, for example, document, chapter, section, sub-section, paragraph, with pages and lines being markers.
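As a rough illustration of what "uses marker elements" could mean in practice, the sketch below rewrites one container element as a pair of empty marker elements, so that it can no longer be split by a verse boundary. It borrows the sID/eID attribute pair that, as I understand it, OSIS provides for this purpose. The class name MilestoneConverter, the method toMilestones, and the "name.N" id scheme are all made up for the example, and the regular expression is a simplification that assumes well-formed input without comments or CDATA.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class MilestoneConverter {
    // Rewrites <element ...> ... </element> as
    // <element sID="id" .../> ... <element eID="id"/>.
    static String toMilestones(String xml, String element) {
        Pattern tag = Pattern.compile(
            "<(/?)" + Pattern.quote(element) + "(\\s[^<>]*?)?(/?)>");
        Deque<String> ids = new ArrayDeque<>(); // ids of containers not yet closed
        StringBuilder out = new StringBuilder();
        Matcher m = tag.matcher(xml);
        int counter = 0;
        int last = 0;
        while (m.find()) {
            out.append(xml, last, m.start()); // copy the text between tags unchanged
            last = m.end();
            boolean isEnd = !m.group(1).isEmpty();
            boolean isEmpty = !m.group(3).isEmpty();
            String attrs = m.group(2) == null ? "" : m.group(2);
            if (isEmpty || (isEnd && ids.isEmpty())) {
                // Already a marker, or an end tag with no known start: leave as is.
                out.append(m.group());
            } else if (!isEnd) {
                String id = element + "." + (++counter); // made-up id scheme
                ids.push(id);
                out.append('<').append(element)
                   .append(" sID=\"").append(id).append('"')
                   .append(attrs).append("/>");
            } else {
                out.append('<').append(element)
                   .append(" eID=\"").append(ids.pop()).append("\"/>");
            }
        }
        out.append(xml, last, xml.length());
        return out.toString();
    }
}
```

For example, toMilestones("<q who=\"God\">a <q>b</q> c</q>", "q") turns both quotations into sID/eID pairs while leaving the text between them untouched.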
Does anyone know of a best practice for OSIS, or any other XML field?
Thanks, DM
_______________________________________________ sword-devel mailing list [EMAIL PROTECTED] http://www.crosswire.org/mailman/listinfo/sword-devel
