Re: [sword-devel] Creating a version of the BSB module with interlinear support

David Haslam Sat, 30 Sep 2023 03:01:42 -0700

Hi Timothy,

Please consult the developers’ wiki:

https://wiki.crosswire.org/

And consult the page about OSIS Bibles there.

David


On Sat, Sep 30, 2023 at 10:54, Timothy Allen <thrist...@gmail.com> wrote:

> The Berean Standard Bible is available in two machine-readable formats: USFM, 
> and "translation tables", a 40MB Excel spreadsheet with a row for every 
> Hebrew or Greek word in their chosen source texts alongside the English text 
> it's translated to. I would like to make one module with the nice formatting 
> of the USFM sources and the metadata from the spreadsheet. I've spent the 
> last few weeks writing a script that runs through them both in parallel and 
> makes sure everything lines up, so I'm now confident that I have an accurate 
> mapping between them.
>
> My question now is, how can I translate the data from the spreadsheet into 
> OSIS?
>
> Here's the information the spreadsheet gives me:
>
> Column: he_ordinal
> Example: 1
> Notes: "Hebrew Ordinal", increments for each spreadsheet row in the Old 
> Testament, set to 999999 for each row in the New Testament
>
> Column: el_ordinal
> Example: 0
> Notes: "Greek Ordinal", set to 0 for each row in the Old Testament, 
> increments for each row in the New Testament, except for Mark 1:1 which has 
> a word with the number 18379.5 (presumably something needed to be inserted 
> and they didn't want to renumber everything else)
>
> Column: en_ordinal
> Example: 1
> Notes: "English Ordinal", increments for each spreadsheet row (except for 
> that word in Mark 1:1)
>
> Column: language
> Example: Hebrew
> Notes: "Hebrew", "Greek", or sometimes "Aramaic"
>
> Column: verse_ordinal
> Example: 1
> Notes: Increments for each verse in the Bible, so every word in Genesis 1:1 
> has "1", etc.
>
> Column: source_word
> Example: בְּרֵאשִׁ֖ית
> Notes: The word in the original source text. Sometimes includes fancy 
> brackets to mark sources other than WLC or Nestle 1904: {TR} ⧼RP⧽ (WH) 〈NE〉 
> [NA] ‹SBL› [[ECM]]
>
> Column: transliteration
> Example: bə·rê·šîṯ
> Notes: A transliteration of the source word into the Latin alphabet
>
> Column: grammar_code
> Example: Prep-b | N-fs
> Notes: A code describing the grammatical form of the word; these don't 
> appear to be Robinson codes, but their own custom thing for Hebrew 
> (https://biblehub.com/hebrewparse.htm) and Greek 
> (https://biblehub.com/abbrev.htm)
>
> Column: grammar_description
> Example: Preposition-b | Noun - feminine singular
> Notes: The grammar code, unabbreviated
>
> Column: strongs_number
> Example: 7225
> Notes: The Strongs number of the basic form of this word
>
> Column: translation
> Example: In the beginning
> Notes: The English text that appears in the BSB
>
> Column: gloss
> Example:
>   1) first, beginning, best, chief
>   1a) beginning
>   1b) first
>   1c) chief
>   1d) choice part
> Notes: A definition from the Brown-Driver-Briggs Hebrew Lexicon, or Thayer's 
> Greek Definitions, as appropriate
>
> Looking at the OSIS 2.1.1 User's Manual (and sniffing around in the KJVA 
> module), to represent this information in OSIS I should use the <w> element, 
> which supports the following attributes (copy/pasted from the Manual):
>
> - gloss Record comments on a particular word or its usage.
> - lemma Use to record the base form of a word.
> - morph Use to record grammatical information for a word.
> - POS Use to record the function of a word according to a particular view of 
> the language's syntax.
> - src Use to record origin of the word.
> - xlit Use to record a transliteration of a word.
>
> The first problem is that sometimes multiple source words are translated into 
> a single English span, and it's not made clear how to express that in these 
> attributes. From poking around in the KJVA module, I get the impression these 
> are supposed to be space-delimited lists. Is that correct?
>
> Assuming that's the case, here are my guesses at how to fill out these 
> attributes for each span:
>
> - gloss can't be done, because each gloss contains spaces which means the 
> displaying app can't figure out which part of the gloss goes with which word
> - lemma is where Strongs numbers go; Greek Strongs numbers should be prefixed 
> with "G" and Hebrew/Aramaic ones with "H0"
> - morph might be used for the "grammar code" content, but I would probably 
> need to figure out how to translate them into Robinson codes first, since 
> that seems to be the only morphological dictionary module in the Crosswire 
> repositories
> - POS is unclear to me; I don't see how it differs from the "morph" attribute
> - src is also unclear: is this for the word order (he_ordinal or el_ordinal, 
> possibly numbered from the beginning of the verse rather than the beginning 
> of the entire Bible) or the actual choice of source text (Nestle1904, TR, NA, 
> SBL, etc.)?
> - xlit clearly comes from the "transliteration" field
>
> One thing that's clearly missing is where to put the source word. How does 
> that work?
>
> Is there another way to represent information that doesn't fit into the <w> 
> element? I'd like this module to be as useful as possible, so I'm hesitant to 
> toss out any information that can be usefully represented.
>
> Is there anything else I've missed or misunderstood?
>
> Timothy.
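
For illustration only, the lemma and xlit guesses in the quoted message could be sketched in Python roughly as follows. Nothing here is confirmed OSIS or SWORD practice: the space-delimited lists and the "G" / "H0" prefixes are the guesses from the message itself, the "strong:" prefix is a further assumption based on how the KJV module appears to write lemma values, and the function names and the dict layout of the rows are hypothetical.

```python
import xml.etree.ElementTree as ET


def strongs_lemma(language, number):
    """Format one Strongs number using the guessed prefixes:
    "G" for Greek, "H0" for Hebrew/Aramaic.
    The "strong:" prefix is an assumption, not confirmed by the thread."""
    prefix = "G" if language == "Greek" else "H0"
    return f"strong:{prefix}{number}"


def build_w(english, rows):
    """Build a <w> element for one English span.

    `rows` is a hypothetical list of dicts, one per spreadsheet row
    (source word) that maps to this span. Values for several source
    words are joined with spaces, which is the unconfirmed assumption
    raised in the message.
    """
    w = ET.Element("w")
    w.set("lemma", " ".join(
        strongs_lemma(r["language"], r["strongs_number"]) for r in rows))
    w.set("xlit", " ".join(r["transliteration"] for r in rows))
    w.text = english
    return w


if __name__ == "__main__":
    row = {"language": "Hebrew", "strongs_number": 7225,
           "transliteration": "bə·rê·šîṯ"}
    print(ET.tostring(build_w("In the beginning", [row]), encoding="unicode"))
```

The gloss, morph, POS and src attributes are left out on purpose, since the message does not settle how they should be filled.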

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
