We've looked at this fairly extensively, and we're pretty certain
there's nothing downloadable that does a "good enough" job. However,
it's by no means impossible -- it seems to be undergrad thesis-level
work in Singapore:

http://wing.comp.nus.edu.sg/parsCit/

There used to be a paper describing this approach (essentially
treating citation parsing as a natural language processing task and
using a maximum entropy algorithm) online... the page even cites
it... but it seems to be gone now.
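
To give a rough idea of what "citation parsing as an NLP task" could look
like: each token of the citation gets a set of features, and a classifier
(maximum entropy or otherwise) is trained to label the token as author,
title, year, and so on. The sketch below shows only the feature-extraction
step. The feature names, the `token_features` helper, and the sample
citation are all made up for illustration; this is not ParsCit's actual
feature set.

```python
import re

# Matches a bare or parenthesized year with optional trailing punctuation,
# e.g. "2001", "(2001)", "(2001).". The 1800-2099 range is an assumption.
YEAR_RE = re.compile(r"\(?(1[89]|20)\d\d\)?[.,;]?")


def token_features(tokens, i):
    """Return a feature dict for tokens[i] (illustrative features only)."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_year": bool(YEAR_RE.fullmatch(tok)),
        "has_digit": any(c.isdigit() for c in tok),
        "ends_with_period": tok.endswith("."),
        "ends_with_comma": tok.endswith(","),
        # Relative position in the citation, bucketed to one decimal place.
        "position": round(i / max(len(tokens) - 1, 1), 1),
        # Neighbouring tokens give the classifier some context.
        "prev": tokens[i - 1].lower() if i > 0 else "<START>",
        "next": tokens[i + 1].lower() if i + 1 < len(tokens) else "<END>",
    }


# A made-up citation, split on whitespace for simplicity.
citation = "Smith, J. (2001). Parsing references. Journal of Examples, 12(3), 45-67."
tokens = citation.split()
features = [token_features(tokens, i) for i in range(len(tokens))]
```

These dicts would then be fed, along with hand-labeled examples, to
whatever classifier you choose; the training step is left out here.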

FWIW, it didn't look too difficult.

-Nate

On Jul 17, 2007, at 6:16 PM, Jonathan Rochkind wrote:

Does anyone have any decent open source code to parse a citation? I'm
talking about a completely narrative citation like someone might
cut-and-paste from a bibliography or web page. I realize there are a
number of different formats this could be in (not to mention the
errors that always occur in human-entered free text)--but thinking
about it, I suspect that with some work you could get something that
worked reasonably well (if not perfectly). So I'm wondering if anyone
has done this work.

(One of the commercial legal products--I forget if it's Lexis or
West--does this with legal citations--a more limited domain--quite
well.  I'm not sure if any of the commercial bibliographic citation
management software does this?)

The goal, as you can probably guess, is a box that the user can paste
a citation into; make an OpenURL out of it; show the user where to get
the citation.  I'm pretty confident something useful could be created
here, with enough time put into it. But sadly, it's probably more time
than anyone has individually. Unless someone's done it already?
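
Once the citation has been parsed into fields, the "make an OpenURL out
of it" step is the easy part. A minimal sketch, assuming a journal
article and OpenURL 1.0 key/value syntax; the resolver address, the
`build_openurl` helper, and the input field names are hypothetical:

```python
from urllib.parse import urlencode

# Hypothetical link resolver address; substitute your institution's own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

# Maps our (made-up) parsed field names to OpenURL journal-format keys.
KEY_MAP = {
    "author_last": "rft.aulast",
    "article_title": "rft.atitle",
    "journal_title": "rft.jtitle",
    "year": "rft.date",
    "volume": "rft.volume",
    "issue": "rft.issue",
    "start_page": "rft.spage",
}


def build_openurl(fields, base=RESOLVER_BASE):
    """Build an OpenURL 1.0 (KEV) link from a dict of parsed fields."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ]
    for name, key in KEY_MAP.items():
        value = fields.get(name)
        if value:  # skip fields the parser could not fill in
            params.append((key, str(value)))
    return base + "?" + urlencode(params)


# Fields as a parser might return them for a made-up citation.
parsed = {
    "author_last": "Smith",
    "article_title": "Parsing references",
    "journal_title": "Journal of Examples",
    "year": 2001,
    "volume": 12,
    "start_page": 45,
}
url = build_openurl(parsed)
```

Fields the parser misses are simply left out, so a partial parse still
yields a usable (if less precise) link.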

Hopefully,
Jonathan
