Re: [sword-devel] Creating a version of the BSB module with interlinear support

David Haslam Mon, 02 Oct 2023 00:50:55 -0700

Morphology is not restricted to Robinson.

The wiki page merely gave that as an example.


A different morphology dictionary could be specified in the OSIS header.

That can be done even before any such dictionary module has been created.
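
As a rough sketch of what that could look like (not a tested recipe): the Python below builds an OSIS header with one `<work>` entry for Strong's numbers and one for a custom morphology scheme, then a `<w>` element whose `morph` prefix matches that work. The work name `bsbMorph` and its `refSystem` value are hypothetical, the `strong` / `Dict.Strongs` pairing is the convention I understand the wiki to describe, and stripping spaces from the BSB grammar code is my own assumption so that the value stays a single token.

```python
import xml.etree.ElementTree as ET

# Hypothetical work identifier for the BSB's own parsing codes; no SWORD
# dictionary module with this name is assumed to exist yet.
MORPH_WORK = "bsbMorph"


def build_header_works(morph_work: str = MORPH_WORK) -> ET.Element:
    """Return an OSIS <header> declaring the Strong's and morphology works."""
    header = ET.Element("header")
    for osis_work, ref_system in (
        ("strong", "Dict.Strongs"),
        (morph_work, f"Dict.{morph_work}"),  # assumed refSystem name
    ):
        work = ET.SubElement(header, "work", osisWork=osis_work)
        ET.SubElement(work, "refSystem").text = ref_system
    return header


def build_word(text: str, strongs: str, grammar_code: str,
               morph_work: str = MORPH_WORK) -> ET.Element:
    """Return a <w> element whose prefixes match the works declared above."""
    # Assumption: spaces are stripped so the code is one token in the attribute.
    code = grammar_code.replace(" ", "")
    w = ET.Element("w", lemma=f"strong:{strongs}", morph=f"{morph_work}:{code}")
    w.text = text
    return w


header_xml = ET.tostring(build_header_works(), encoding="unicode")
word_xml = ET.tostring(
    build_word("In the beginning", "H7225", "Prep-b | N-fs"),
    encoding="unicode",
)
print(header_xml)
print(word_xml)
```

The point is only that the prefix before the colon in `morph` is a name declared in the header, so the text can be marked up first and the matching dictionary module supplied later.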

David


On Mon, Oct 2, 2023 at 08:38, Timothy Allen <thrist...@gmail.com> wrote:

> Ah, thanks. I did look at that page when I started making my module, but I'd 
> forgotten about it by the time I needed this more detailed advice. Thanks for 
> reminding me! Using this to update the guesses from my original message:
>
> - gloss: I *might* be able to try grabbing the first word from the BDB/Thayer
>   gloss, but that seems error-prone and I probably won't bother unless
>   somebody really wants it.
> - lemma: This should be used for Strongs numbers, marked up as
>   "strong:G123" or "strong:H123", but could also be used for storing the
>   original source text as "lemma.BSB:בְּרֵאשִׁ֖ית" if we assume a hypothetical
>   lexicon that indexes all the words in the BSB.
> - morph: This should be used for Robinson morphology codes, so I should not
>   bother with this until I can figure out how to translate the BSB's codes
>   to Robinson ones. The wiki page also has "strongMorph" codes in its
>   examples, but I can't find any extra information on what system this
>   might refer to. Apparently there aren't any Hebrew morphology lexicons
>   available for SWORD; maybe someday I could make one?
> - POS: Still unclear to me, it's not mentioned on the wiki page.
> - src: Apparently this is for word order in the source language, but it's
>   not at all clear where "word 1" is. The start of the <w> element? The
>   start of the verse? The start of the chapter? The start of the book? The
>   start of the Bible? Does it not matter, because front-ends are intended
>   to just sort the words they have?
> - xlit: Still for the transliteration, simply enough.
>
> According to the wiki page, there's also an "n" attribute not mentioned in 
> the official OSIS docs, which is for "marking enumerated words". I don't know 
> what this means, and the wiki page doesn't include any examples. I'm going to 
> guess I don't need it.
>
> Do I have all that right? Is there anything I've misunderstood?
>
> Also, would it be better to have "lemma.BSB:בְּרֵאשִׁ֖ית" and use the same 
> "BSB" lexicon for every word in the entire text, or would it be more 
> appropriate to use "lemma.WLC:בְּרֵאשִׁ֖ית" and use different lexicons to 
> indicate the different sources used for the translation (Nestle1904, TR, NA, 
> SBL, etc.)?
>
> Timothy
>
> On 30/9/23 20:00, David Haslam wrote:
>
>> Hi Timothy,
>>
>> Please consult the developers’ wiki
>>
>> https://wiki.crosswire.org/
>>
>> And consult the page about OSIS Bibles.
>>
>> David
>>
>>
>> On Sat, Sep 30, 2023 at 10:54, Timothy Allen <thrist...@gmail.com> wrote:
>>
>>> The Berean Standard Bible is available in two machine-readable formats: 
>>> USFM, and "translation tables", a 40MB Excel spreadsheet with a row for 
>>> every Hebrew or Greek word in their chosen source texts with the English 
>>> text it's translated to. I would like to make one module with the nice 
>>> formatting of the USFM sources and the metadata from the spreadsheet, so 
>>> I've spent the last few weeks writing a script that runs through them both 
>>> in parallel and makes sure everything lines up, so I'm now confident that I 
>>> have an accurate mapping between them.
>>>
>>> My question now is, how can I translate the data from the spreadsheet into 
>>> OSIS?
>>>
>>> Here's the information the spreadsheet gives me:
>>>
>>> Column (Example): Notes
>>>
>>> - he_ordinal (1): "Hebrew Ordinal", increments for each spreadsheet row
>>>   in the Old Testament, set to 999999 for each row in the New Testament
>>> - el_ordinal (0): "Greek Ordinal", set to 0 for each row in the Old
>>>   Testament, increments for each row in the New Testament, except for
>>>   Mark 1:1 which has a word with the number 18379.5 (presumably something
>>>   needed to be inserted and they didn't want to renumber everything else)
>>> - en_ordinal (1): "English Ordinal", increments for each spreadsheet row
>>>   (except for that word in Mark 1:1)
>>> - language (Hebrew): "Hebrew", "Greek", or sometimes "Aramaic"
>>> - verse_ordinal (1): Increments for each verse in the Bible, so every
>>>   word in Genesis 1:1 has "1", etc.
>>> - source_word (בְּרֵאשִׁ֖ית): The word in the original source text. Sometimes
>>>   includes fancy brackets to mark sources other than WLC or Nestle 1904:
>>>   {TR} ⧼RP⧽ (WH) 〈NE〉 [NA] ‹SBL› [[ECM]]
>>> - transliteration (bə·rê·šîṯ): A transliteration of the source word into
>>>   the Latin alphabet
>>> - grammar_code (Prep-b | N-fs): A code describing the grammatical form of
>>>   the word; these don't appear to be Robinson codes, but their own custom
>>>   thing for Hebrew (https://biblehub.com/hebrewparse.htm) and Greek
>>>   (https://biblehub.com/abbrev.htm)
>>> - grammar_description (Preposition-b | Noun - feminine singular): The
>>>   grammar code, unabbreviated
>>> - strongs_number (7225): The Strongs number of the basic form of this word
>>> - translation (In the beginning): The English text that appears in the BSB
>>> - gloss: A definition from the Brown-Driver-Briggs Hebrew Lexicon, or
>>>   Thayer's Greek Definitions, as appropriate. Example:
>>>     1) first, beginning, best, chief
>>>     1a) beginning
>>>     1b) first
>>>     1c) chief
>>>     1d) choice part
>>>
>>> Looking at the OSIS 2.1.1 User's Manual (and sniffing around in the KJVA 
>>> module), to represent this information in OSIS I should use the <w> 
>>> element, which supports the following attributes (copy/pasted from the 
>>> Manual):
>>>
>>> - gloss Record comments on a particular word or its usage.
>>> - lemma Use to record the base form of a word.
>>> - morph Use to record grammatical information for a word.
>>> - POS Use to record the function of a word according to a particular view 
>>> of the language's syntax.
>>> - src Use to record origin of the word.
>>> - xlit Use to record a transliteration of a word.
>>>
>>> The first problem is that sometimes multiple source words are translated 
>>> into a single English span, and it's not made clear how to express that in 
>>> these attributes. From poking around in the KJVA module, I get the 
>>> impression these are supposed to be space-delimited lists. Is that correct?
>>>
>>> Assuming that's the case, here's my guesses at how to fill out these 
>>> attributes for each span:
>>>
>>> - gloss can't be done, because each gloss contains spaces which means the 
>>> displaying app can't figure out which part of the gloss goes with which word
>>> - lemma is where Strongs numbers go; Greek Strongs numbers should be 
>>> prefixed with "G" and Hebrew/Aramaic ones with "H0"
>>> - morph might be used for the "grammar code" content, but I would probably 
>>> need to figure out how to translate them into Robinson codes first, since 
>>> that seems to be the only morphological dictionary module in the Crosswire 
>>> repositories
>>> - POS is unclear to me, I don't see how it differs from the "morph" 
>>> attribute
>>> - src is also unclear: is this for the word order (he_ordinal or 
>>> el_ordinal, possibly numbered from the beginning of the verse rather than 
>>> the beginning of the entire Bible) or the actual choice of source text 
>>> (Nestle1904, TR, NA, SBL, etc.)?
>>> - xlit clearly comes from the "transliteration" field
>>>
>>> One thing that's clearly missing is where to put the source word. How does 
>>> that work?
>>>
>>> Is there any other way to represent information that doesn't fit into the <w> 
>>> element? I'd like this module to be as useful as possible, so I'm hesitant 
>>> to toss out any information that can be usefully represented.
>>>
>>> Is there anything else I've missed or misunderstood?
>>>
>>> Timothy.
>>
>> _______________________________________________
>> sword-devel mailing list:
>> sword-devel@crosswire.org
>>
>> http://crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page

