A year ago I did some initial work developing a PHP implementation for such a parser: http://open-scriptures.googlecode.com/svn/branches/php-prototypes/reference-parser.lib.php
Maybe the algorithms would be of use to you. I only refined it for New Testament references.

On Fri, Nov 20, 2009 at 5:28 AM, DM Smith <dmsm...@crosswire.org> wrote:
> SWORD's ability to parse arbitrary input into a list of verses is awesome.
> It is far more powerful than what is needed for an osisID or an osisRef.
>
> The structure of these for biblical references is very well defined.
>
> Here is a partial BNF for it. (I've simplified/extended the BNF with [ ] to
> represent optional instead of using ε for the empty production and allow
> them to be anywhere.)
>
> # An osisRef can be a space separated list of osisRefs
> # or two osisIDs separated by a dash
> osisRef ::= <osisRef> " " <osisRef>
>           | <osisID> [ "-" <osisID> ]
>
> # An osisID is a reference with optional work prefix and/or grain,
> osisID ::= [ <workPrefix> ":" ] <reference> [ "!" <grain> ]
>
> # A reference has a book name and can be followed by a chapter and a verse,
> # separated by a period '.'
> reference ::= <bookname>
>             | <bookname> "." <number>
>             | <bookname> "." <number> "." <number>
>
> # Book names are normalized to a particular list, including the
> # deuterocanonical books.
> bookname ::= "Gen" | "Exod" | "Lev" | ... skipping for brevity... | "Rev"
>
> # Numbers are nonzero and never have leading zeros
> number ::= <nzdigit> [ <digits> ]
>
> nzdigit ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
> digit ::= "0" | <nzdigit>
>
> digits ::= <digit> [ <digits> ]
>
> workPrefix ::= .....
> grain ::= .....
>
> I'd like to write parseOsisRef and parseOsisID and use it within osis2mod.
> Right now, I have to munge the osisRefs and osisIDs to a form that
> ParseVerseList will understand.
>
> The code will be much simpler and much faster than ParseVerseList. Here are
> some of the specialties of ParseVerseList that don't need to be handled.
> a) It understands internationalized book names
> b) It understands all kinds of abbreviations for book names
> c) It allows roman numerals in book names.
> d) It does not require a book name for a reference, but uses the last seen
> reference's book name as a basis.
> e) Likewise, it does not require a chapter number for a verse reference,
> but uses the last seen reference's book and chapter as a basis.
> f) It allows special constructs such as "v 3", "c 4" and "9f" and "12ff"
> for verse, chapter, next verse and to the end of the chapter. (There are
> other special constructs.)
>
> Any input?
>
> In Him,
> DM
>
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
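
To make the quoted grammar concrete, here is a rough sketch of what a strict parser for it might look like. It is written in Python purely for illustration; it is not the PHP prototype linked above and not SWORD code. The names parse_osis_id, parse_osis_ref, OsisID and BOOKS are made up for this example. The book list is cut down to the four names the BNF actually spells out, and since the BNF leaves workPrefix and grain unspecified, the patterns used for them here are placeholder assumptions (any run of characters without whitespace or "-", and for the prefix also without ":" or "!").

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# Assumption: only the four names spelled out in the BNF; the real list is longer.
BOOKS = frozenset({"Gen", "Exod", "Lev", "Rev"})

# number ::= <nzdigit> [ <digits> ]  (nonzero, no leading zeros)
_NUMBER = r"[1-9][0-9]*"

# osisID ::= [ <workPrefix> ":" ] <reference> [ "!" <grain> ]
# Assumption: workPrefix and grain are elided in the BNF, so the character
# classes used for them below are placeholders, not the real OSIS rules.
_OSIS_ID = re.compile(
    r"(?:(?P<work>[^:!\s-]+):)?"
    r"(?P<book>[A-Za-z0-9]+)"
    r"(?:\.(?P<chapter>" + _NUMBER + r")"
    r"(?:\.(?P<verse>" + _NUMBER + r"))?)?"
    r"(?:!(?P<grain>[^\s-]+))?"
)


class OsisID(NamedTuple):
    work: Optional[str]
    book: str
    chapter: Optional[int]
    verse: Optional[int]
    grain: Optional[str]


def parse_osis_id(text: str) -> OsisID:
    """Parse one osisID; raise ValueError if it does not fit the grammar."""
    m = _OSIS_ID.fullmatch(text)
    if m is None or m.group("book") not in BOOKS:
        raise ValueError(f"not a valid osisID: {text!r}")
    chapter, verse = m.group("chapter"), m.group("verse")
    return OsisID(
        m.group("work"),
        m.group("book"),
        int(chapter) if chapter else None,
        int(verse) if verse else None,
        m.group("grain"),
    )


def parse_osis_ref(text: str) -> List[Tuple[OsisID, OsisID]]:
    """Parse a space separated osisRef into (start, end) pairs.

    A single osisID yields a pair whose start and end are equal.
    """
    ranges = []
    for item in text.split(" "):
        start_text, dash, end_text = item.partition("-")
        start = parse_osis_id(start_text)
        end = parse_osis_id(end_text) if dash else start
        ranges.append((start, end))
    return ranges
```

With this sketch, parse_osis_ref("Gen.1.1-Gen.1.3 Rev.22") gives two pairs, the second with identical start and end, while inputs such as "Gen.01.1" or "Genesis.1" raise ValueError. None of the ParseVerseList conveniences listed in (a) through (f) are handled, which is the point: the whole thing reduces to one regular expression and two splits.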