Thanks again all,

I love OpenRefine - I've been working on the GOKb project (http://gokb.org),
where K-Int (a UK-based company) have developed an extension for OpenRefine
which helps with the cleaning of data about electronic resources (esp.
journals) from publishers and then pushes it into the GOKb database. The
extension is fully integrated into the GOKb database, but if anyone wants a
look, the code is at https://github.com/k-int/gokb-phase1/tree/dev/refine. The
extension checks the data and reports errors as well as offering ways of fixing
common issues - there's more on the wiki:
https://wiki.kuali.org/display/OLE/OpenRefine+How-Tos

I did pitch an OpenRefine workshop for the same event as a 'data
wrangling/cleaning' tool, but the 'automation' session got the vote in the end -
although there is definitely overlap. However, I am delivering an OpenRefine
workshop at the British Library next week - and it's great to see it getting
used across libraries.

Google Docs Spreadsheets is also a great tip - I've run a course at the
British Library which uses it to introduce the concept of APIs to
non-techies. I blogged the original tutorial at
http://www.meanboyfriend.com/overdue_ideas/2013/02/introduction-to-apis/ but a
change to the BL open data platform means the tutorial no longer works :((

Thanks all again - I'll be trying to put stuff from the automation workshop 
online at some point and I'll post here when there is something up.

Best wishes,

Owen


Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: [email protected]
Telephone: 0121 288 6936

On 8 Jul 2014, at 03:52, davesgonechina <[email protected]> wrote:

> +1 to OpenRefine. Some extensions, like RDF Refine <http://refine.deri.ie/>,
> currently only work with the old Google Refine (still available here
> <https://code.google.com/p/google-refine/>). There are a good many
> interesting projects for OpenRefine on GitHub and GitHub Gist.
> 
> Google Docs Spreadsheets also has a surprising amount of functionality,
> such as importXML if you're willing to get your hands dirty with regular
> expressions.
> 
> Dave
> 
> 
> On Tue, Jul 8, 2014 at 3:12 AM, Tillman, Ruth K. (GSFC-272.0)[CADENCE GROUP
> ASSOC] <[email protected]> wrote:
> 
>> Definite cosign on Open Refine. It's intuitive and spreadsheet-like enough
>> that a lot of people can understand it. You can do anything from
>> standardizing state names you get from a patron form to normalizing
>> metadata keywords for a database, so I think it'd be useful even for
>> non-techies.
>> 
>> Ruth Kitchin Tillman
>> Metadata Librarian, Cadence Group
>> NASA Goddard Space Flight Center Library, Code 272
>> Greenbelt, MD 20771
>> Goddard Library Repository: http://gsfcir.gsfc.nasa.gov/
>> 301.286.6246
>> 
>> 
>> -----Original Message-----
>> From: Code for Libraries [mailto:[email protected]] On Behalf Of
>> Terry Brady
>> Sent: Monday, July 07, 2014 1:35 PM
>> To: [email protected]
>> Subject: Re: [CODE4LIB] 'automation' tools
>> 
>> I learned about Open Refine <http://openrefine.org/> at the Code4Lib
>> conference, and it looks like it would be a great tool for normalizing
>> data.  I worked on a few projects in the past in which this would have been
>> very helpful.
>> 
