···<date: 2012-09-18, Tuesday>···<from: Marco Patzer>···

> 2012-09-18 Philipp Gesang <ges...@stud.uni-heidelberg.de>:
> 
> > [0] http://www.mail-archive.com/ntg-context@ntg.nl/msg62855.html
> 
> Thanks for the link. Since I usually don't deal much with
> different bibliography styles I tend to skip those threads.
> 
> > >            And BibTeX is used since it understands the semantics of
> > > bib files, although a pure ConTeXt/Lua solution would be possible.
> > > Without BibTeX this functionality would be missing since no one is
> > > willing to implement a parser for .bib databases.
> > 
> > Context happens to have such a parser, written in Lua. Probably
> > the best one around:
> > 
> > ·······································································
> > \starttext
> >   \startluacode
> >     local db = bibtex.new()
> >     bibtex.load(db, "filename.bib")
> >     table.print(db)
> >   \stopluacode
> > \stoptext
> > ·······································································
> 
> Interesting, I didn't know that. But the values are only parsed, not
> interpreted. That means the only thing left for BibTeX to do is
> to interpret the ugly “author” field?

From my bibliography (this assumes authors are separated by
“ and ”; *warning*: embarrassingly ugly code ahead):

·······································································
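-- Assumed preamble (not shown in the original excerpt): the local
-- aliases and the module table that the code below relies on.
-- ‘warn’ is taken to be a logging helper defined elsewhere.
local lpeg = lpeg or require("lpeg")
local P, S, V, C, Ct = lpeg.P, lpeg.S, lpeg.V, lpeg.C, lpeg.Ct
local lpegmatch   = lpeg.match
local stringfind  = string.find
local tableremove = table.remove
local tableconcat = table.concat
local citator     = citator or { }

-- adapted from Roberto
-- www.inf.puc-rio.br/~roberto/lpeg.html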
function citator.split (s, sep)
  if type(sep) == "string" then
    sep = P(sep)
  end
  local elem = C((1 - sep)^0)
  local p = Ct(elem * (sep * elem)^0)
  return lpegmatch(p, s)
end
local split = citator.split

-- Return a list of authors' names from a string separated by "and".
local _p_spaces = S" \n\t\v"^1
local _p_and    = _p_spaces * P"and" * _p_spaces
function citator.get_author_list (rawaut)
    if not stringfind(rawaut, "and") then return { rawaut } end
    return split(rawaut, _p_and)
end
local get_author_list = citator.get_author_list

do
    local wl = P{
        [1] = "words",

        left  = P"{",  right = P"}",
        space = P" ",  tabs  = S"\v\t",
        eol   = P"\n", whitespace = V"space" + V"tabs" + V"eol",

        inbrace = V"left" * (1 - V"right")^1 * V"right",
        other = (1 - V"inbrace" - V"whitespace")^1,

        elm = V"inbrace" + V"other",

        words = Ct((V"whitespace"^0 * C(V"elm"))^0)
    }

    -- Takes a string and splits it into words, returning a list of words.
    function citator.get_word_list(s)
        return lpegmatch(wl, s)
    end
end
local get_word_list = citator.get_word_list

-- from http://osdir.com/ml/l...@bazar2.conectiva.com.br/2009-12/msg00910.html
do
    local space = S" \t\v\n"
    local nospace = 1 - space
    local ptrim = space^0 * C((space^0 * nospace^1)^0)
    function citator.strip (s)
        return lpegmatch(ptrim, s)
    end
end


-- Return the formatted author field for one author string: the last word
-- is moved to the front and followed by a comma. A reversed result starts
-- with a space. The parameter ‘form’ is unused.
function citator.reverse_one_author (rawaut, form)
    local         listaut = get_word_list(rawaut)
    local formaut, tmpaut = "", {}
    if (#listaut > 1) then
        for i,j in ipairs(listaut) do
            listaut[i] = citator.strip(j)
        end
        local lastname = listaut[#listaut] .. ","
        tableremove(listaut, #listaut)
        tmpaut[#tmpaut+1] = lastname
        for i,j in ipairs(listaut) do tmpaut[#tmpaut+1] = j end
        for i,j in ipairs(tmpaut)  do formaut = formaut .. " " .. j end
    else
        formaut = listaut[1]
    end
    return formaut
end
local reverse_one_author = citator.reverse_one_author

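-- Take a string of authors' names rawaut and return a list that is built
-- according to citator.cite_author_form (“allinv”, “firstinv”, or anything
-- else for no inversion). At most citator.compress_authors names are kept;
-- the rest is replaced by the style's cite_etal_string.
-- <string> ‘resultformat’: if it has the value ‘string’ then the function will
-- return a string instead of a table.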
function citator.format_author_list (rawaut, resultformat)
    warn("author list", rawaut)
    local max        = citator.compress_authors -- <int>, default=3
    local authorlist = get_author_list(rawaut)
    local cnt = 1
    local tmplist = {}
    local citestyle = citator.styles[citator.cite_style] or fancy2
    local etal      = citestyle.cite_etal_string
    repeat
        if cnt == 1 then
            if citator.cite_author_form == "allinv"   or
               citator.cite_author_form == "firstinv" then
                tmplist[#tmplist+1] = reverse_one_author(authorlist[cnt])
                warn("num: "..cnt, authorlist[cnt])
            else -- don’t reverse anything
                tmplist[#tmplist+1] = authorlist[cnt]
            end
        elseif cnt > max then
            tmplist[#tmplist+1] = etal
            break
        else
            warn("num: "..cnt, authorlist[cnt])
            if citator.cite_author_form == "allinv" then
                tmplist[#tmplist+1] = reverse_one_author(authorlist[cnt])
            else -- “firstinv” and every other form: keep the name as is
                tmplist[#tmplist+1] = citestyle.cite_author_separator
                tmplist[#tmplist+1] = authorlist[cnt]
            end
        end
        cnt = cnt + 1
    until authorlist[cnt] == nil
    warn(#tmplist, tmplist[1])
    if resultformat == "string" then
        return tableconcat(tmplist)
    end
    return tmplist
end
local format_author_list = citator.format_author_list
·······································································

As you can see, all I have to offer is spaghetti :P And the
formatting rules for names (the fields author, bookauthor,
translator, editor, bookeditor, commentator, etc.) are by no
means everything that BibTeX handles.
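
One example of what is missing: BibTeX also accepts names that are already written as “Last, First”, and the word-based reversal above would mangle those (“Knuth, Donald E.” would come out with “E.,” in front). A minimal sketch in plain Lua of normalising both forms follows. `normalize_name` and `inverted` are hypothetical helpers, not part of the code above, and the sketch assumes there are no “von” or “Jr” parts and no brace-protected groups.

```lua
-- Hypothetical helper: split one name into first and last parts.
-- Handles only "First Last" and "Last, First".
local function normalize_name (raw)
    local name = raw:match("^%s*(.-)%s*$")   -- trim surrounding whitespace
    local last, first = name:match("^([^,]+),%s*(.+)$")
    if not last then
        -- No comma: treat the final word as the surname.
        first, last = name:match("^(.-)%s*(%S+)$")
    end
    return { first = first, last = last }
end

-- Return the name in "Last, First" form, whichever form it came in.
local function inverted (raw)
    local n = normalize_name(raw)
    if n.first == "" then return n.last end  -- single-word name
    return n.last .. ", " .. n.first
end

print(inverted("Donald E. Knuth"))    --> Knuth, Donald E.
print(inverted("Knuth, Donald E."))   --> Knuth, Donald E.
print(inverted("Aristotle"))          --> Aristotle
```

Even this small step shows how quickly the special cases pile up; particles, suffixes and corporate authors each need their own rule.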

The hard part is the formatting of entries according to cite
style (apa etc.) and method (short, number, full). Then strings
(ibidem, et al.) need to respect i18n. Sorting of the bib has to
take place on a certain set of fields in a certain order
depending on whether the entry has an author field or only an
editor or both ... and then there is the problem with names in
general:
http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
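
To make the sorting point concrete, here is a minimal sketch in plain Lua. The entry layout (`author`, `editor`, `year`, `title` as plain strings), the fallback order author, then editor, then title, and the sample entries are all assumptions for illustration. This is not what BibTeX or any ConTeXt module actually does, and real styles differ in exactly these choices.

```lua
-- Sketch only: field names, key order and sample data are made up.

-- Build a sort key for one entry: prefer the author, fall back to the
-- editor, and finally to the title when neither is present.
local function sort_key (entry)
    local name = entry.author or entry.editor or entry.title or ""
    return string.lower(name), tonumber(entry.year) or 0
end

-- Compare two entries by name first, then by year.
local function entry_less (a, b)
    local name_a, year_a = sort_key(a)
    local name_b, year_b = sort_key(b)
    if name_a ~= name_b then return name_a < name_b end
    return year_a < year_b
end

local entries = {
    { author = "Doe, Jane",    year = "1984", title = "Example B" },
    { editor = "Roe, Richard", year = "2001", title = "Example C" },
    { author = "Doe, Jane",    year = "1973", title = "Example A" },
}

table.sort(entries, entry_less)

for _, e in ipairs(entries) do
    print(e.author or e.editor, e.year, e.title)
end
```

Note that `<` on Lua strings follows the current C locale, so names with accented or non-Latin characters will not sort the way a given language expects without proper collation, which is the i18n problem again.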

I don’t want to be spreading pessimism, but these problems are
easily underestimated.

Philipp




> 
> 
> Marco
> 
> ___________________________________________________________________________________
> If your question is of interest to others as well, please add an entry to the 
> Wiki!
> 
> maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
> webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
> archive  : http://foundry.supelec.fr/projects/contextrev/
> wiki     : http://contextgarden.net
> ___________________________________________________________________________________

-- 
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

