On Jul 18, 2007, at 10:04 PM, Eric Hellman wrote:
Also, even in (many) scholarly journals, editorial consistency is
almost unbelievably poor -- lots of times, the rules just aren't
followed. Punctuation gets missed, journal names (especially
abbreviations!) are misspelled... and so on. Rule-based
On 7/20/07, Eric Hellman wrote:
Have people been able to do a decent job of identifying parts of
speech in natural language?
I think trying to import broad NLP findings into our narrower problem of
citation parsing is not likely to be fruitful, but on the other hand
Godmar Back wrote:
A year or so ago a couple of students looked into this for LibX. There
are a number of systems that people have published about, although
some are not available and none worked very well or were easy to get
to work. The systems also varied in their computational complexity,
Nice, that might be what I need. Maybe I'll take a look at the LibX
code, it's open source, right?
Google Scholar has no API--you're screen scraping it?
Jonathan
Godmar Back wrote:
A year or so ago a couple of students looked into this for LibX. There
are a number of systems that people have
On 7/18/07, Steve Toub wrote:
Agreed that a lookup against something like Google Scholar, Web of
Science, or a set of federated search targets, for instance, may yield
better results. We've discussed it but haven't done any testing.
Use your LibX edition, Steve. I can also send a
On 7/18/07, Jonathan Rochkind wrote:
Nice, that might be what I need. Maybe I'll take a look at the LibX
code, it's open source, right?
Google Scholar has no API--you're screen scraping it?
Yes and yes.
- Godmar
Hi Jonathan,
There is a Perl module by Mike Jewell which was written for this purpose:
http://search.cpan.org/~mjewell/Biblio-Citation-Parser-1.10/
I am not using the code, so I can't comment on how well it may work for
your purpose, but it's probably worth a look.
-- Alberto
On 7/17/07,
Ha! If it's not too difficult, then with all the time you've spent
looking at it extensively, how come you don't have a solution yet?
Just kidding. :)
Jonathan
Nathan Vack wrote:
We've looked at this pretty extensively, and we're pretty certain
there's nothing downloadable that does a good
Having written a pretty decent citation parser 10 years ago (in
Applescript!), and having seen a lot of people take whacks at the
problem, I have to say that it's pretty easy to write one that works
on 70-80% of citations, particularly if you stick to one scholarly
subject area. On the other
Does anyone have any decent open source code to parse a citation? I'm
talking about a completely narrative citation like someone might
cut-and-paste from a bibliography or web page. I realize there are a
number of different formats this could be in (not to mention the human
error problems that