Re: [CODE4LIB] Examples of Web Service APIs in Academic & Public Libraries

Jonathan Rochkind Wed, 19 Oct 2011 06:58:19 -0700

If someone else were getting started and didn't want to assemble their own training data -- do you think it would likely be useful for them to aggregate your training data _and_ Brown's training data together and generate a new model? Was there a particular reason you chose not to use Brown's training data and add on to it, but to start over from scratch?

Forgive me if this is a stupid question; I'm still trying to learn about this stuff, and to start figuring out how I'm going to deal with it when I get around to using FreeCite, which I surely will.

Would it maybe make sense to actually separate the training data and trained model into a separate library, so people could even pick and choose which already-built trained model they want to use, or build their own, without dealing with repo conflicts?
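
The aggregation idea above could look roughly like the following. This is only a sketch under assumptions: it supposes each training set can be exported as a list of tagged citation strings, one example per string, which may not match how FreeCite actually stores its training data (Ross notes it lives in the database). The function names are made up for illustration.

```python
def normalize(citation):
    """Collapse runs of whitespace so trivially different copies compare equal."""
    return " ".join(citation.split())


def merge_training_sets(*sources):
    """Combine several lists of training citations into one list.

    Assumption: each source is an iterable of strings, one training
    example per string. Duplicates (after whitespace normalization) and
    blank entries are dropped; first-seen order is preserved.
    """
    seen = set()
    merged = []
    for source in sources:
        for citation in source:
            key = normalize(citation)
            if key and key not in seen:
                seen.add(key)
                merged.append(key)
    return merged
```

A new model would then be trained from the merged list instead of from either set alone. Whether the combined data actually yields a better model is an open question that only retraining and testing would answer.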

The training data is not currently under source control (it's in the database), but the trained model is.

That's, admittedly, a bit of a downside to my fork (although the model being checked into git is true of the original, as well), since you'd always be in conflict with my trained model if you train your own.


-Ross.

On Monday, October 17, 2011, Jonathan Rochkind wrote:
> When you say you've added to the training data, have you shared your additions back with Brown, or is your new improved training data only in your fork? Or is it only held locally by you and isn't even in your GitHub fork? Please clarify, thanks!

>
> On 10/13/2011 8:52 PM, Ross Singer wrote:
>>
>> Yeah, we've been doing a lot with (and putting a lot of updates into)
>> FreeCite.  We only use the webservice (although we don't use the
>> OpenURL context object and instead added a JSON response).  It works
>> pretty well (not always great, but certainly better than nothing) -
>> especially for giving us something "good enough" to throw against some
>> OpenLibrary and Crossref data to look for matches.  Basically what
>> we're using it for is to go from a citation string to an RDF graph.
>>
>> BTW, there have been no problems with post-2000 dates (not to say that
>> there aren't plenty of other problems) - this might have been either a
>> training issue or something a later version of CRF++ worked out.  We
>> also add the citations it couldn't parse correctly to its training
>> data, which might help this.
>>
>> Anyway, yeah, if anybody is interested, feel free to try it out.  One
>> thing my fork does is remove the PostgreSQL dependency, if that's an
>> issue for anybody.  It's kind of handy to be able to just use SQLite
>> or MySQL or whatever to try it out.
>>
>> -Ross.
>>
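
The "citation string to RDF graph" step Ross describes can be illustrated with a small sketch. It is not the code from his fork. It assumes the parser's JSON response has already been decoded into a dict, and the keys used here ("authors", "title", "year") are hypothetical, as is the choice of Dublin Core terms as predicates. Output is N-Triples, one statement per line.

```python
import json

DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical mapping from parsed-citation keys to RDF predicates.
# The real field names in the FreeCite JSON response may differ.
FIELD_TO_PREDICATE = {
    "authors": DCTERMS + "creator",
    "title": DCTERMS + "title",
    "year": DCTERMS + "date",
}


def citation_to_ntriples(parsed, subject):
    """Turn a parsed-citation dict into a list of N-Triples statements.

    `subject` is the IRI that identifies the cited work. Keys that are
    not in FIELD_TO_PREDICATE are ignored; list values yield one
    statement per item.
    """
    lines = []
    for key, predicate in FIELD_TO_PREDICATE.items():
        value = parsed.get(key)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        for item in values:
            # json.dumps gives a double-quoted string with quotes and
            # backslashes escaped, which is also valid N-Triples syntax.
            literal = json.dumps(str(item), ensure_ascii=False)
            lines.append(f"<{subject}> <{predicate}> {literal} .")
    return lines
```

Statements like these are only a starting point. Matching against OpenLibrary or Crossref, as described above, would then replace or enrich the plain literals with identifiers from those services.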

>> On Thu, Oct 13, 2011 at 7:42 PM, Avram Lyon wrote:
>>>
>>> On Thu, Oct 13, 2011 at 2:33 PM, Will Kurt wrote:
>>>>
>>>> I always think that Brown's FreeCite API is underutilized.
>>>> http://freecite.library.brown.edu/
>>>> It's far from perfect, but I'm sure more use could be made of it.
>>>>

>>>> A few months back I threw together a copy/paste citation look-up with it:
>>>> CiteBox
>>>> http://willkurt.github.com/CiteBox/
>>>>
>>>> Of course I don't think anyone is really making use of it, but I've
>>>> also done nothing to really promote it either ;)
>>>
>>> The FreeCite parser had major issues for a while with post-2000 dates,
>>> and I believe the installation at Brown still does, but, to judge by
>>> the GitHub activity (most active fork here:
>>> https://github.com/rsinger/free_cite/), some enterprising folks have
>>> picked it up after a period of apparent dormancy. This is great to
>>> see, and vital to any project that hopes to use its API for anything
>>> serious.
>>>
>>> By the way, the rarely-used XML representation of OpenURL
>>> ContextObjects that FreeCite produces is supported by Zotero as a
>>> full-fledged input format, a fact that might come in handy if you're
>>> hoping to have your API produce something that Zotero users can
>>> import.
>>>
>>> Avram
>>>
>>> UCLA Slavic, Zotero community dev
>>>

>
