Hello Scott,
I would be curious to hear more about what you expect from OpenRefine in this case. I know OpenRefine is powerful for many things, but I can't see how it applies here. Can you expand?

Thanks
----
Sylvain Machefert - Bordeaux, France
Web services librarian - http://geobib.fr/en

On 22/11/2014 19:44, scott bacon wrote:
Erica,

You may find what you need from OpenRefine: http://openrefine.org/
On Fri, Nov 21, 2014 at 5:15 PM, Erica FINDLEY <eri...@multco.us> wrote:

Greetings,

I am working on a project to digitize concert programs. These are the type
of programs you get when attending a musical concert that list performers
and details about the concert.

Since these items are text-heavy, we have decided to use OCR software to
output a text file that will enable full-text searching in our platform.

These text files are for the most part accurate, but often have unnecessary
line breaks and pockets of extra characters and/or incorrect
capitalization. I would like to pretty them up a little bit if possible.

I am wondering if there is a script I can use on multiple files to clean
these types of things up. I don't want to have the digitization staff
manually edit each text file or have to open each one to run a macro in a
text editor.
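
For illustration, a batch cleanup of this kind could be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not a tested solution for these files: the function names (clean_text, clean_directory), the ".txt" extension, and the cleanup heuristics are all hypothetical. It rejoins a line to the previous one when it starts with a lowercase letter, drops runs of three or more symbol characters as presumed OCR noise, and collapses extra spaces and blank lines. It does not attempt to fix capitalization, and the thresholds would need tuning against real OCR output.

```python
import re
from pathlib import Path


def clean_text(text: str) -> str:
    """Apply a few conservative, heuristic cleanups to OCR output."""
    # Normalize Windows/Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Remove trailing spaces and tabs at the end of each line.
    text = re.sub(r"[ \t]+\n", "\n", text)
    # Rejoin words hyphenated across a line break ("attend-\ning").
    # Assumption: this can also merge genuine compounds such as "well-\nknown".
    text = re.sub(r"(\w)-\n([a-z])", r"\1\2", text)
    # Treat a single line break followed by a lowercase letter as unnecessary.
    # Lines starting with a capital (e.g. performer names) are left alone.
    text = re.sub(r"(?<!\n)\n(?=[a-z])", " ", text)
    # Drop runs of 3+ symbol characters, assumed to be OCR noise.
    # Note: this also removes "..." and "---" if they occur legitimately.
    text = re.sub(r"[^\w\s]{3,}", "", text)
    # Collapse repeated spaces/tabs and runs of blank lines.
    text = re.sub(r"[ \t]{2,}", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip() + "\n"


def clean_directory(src_dir: str, dst_dir: str) -> int:
    """Clean every .txt file in src_dir, writing results to dst_dir.

    Originals are never overwritten. Returns the number of files processed.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(src.glob("*.txt")):
        # Assumption: files are UTF-8; undecodable bytes are replaced.
        raw = path.read_text(encoding="utf-8", errors="replace")
        (dst / path.name).write_text(clean_text(raw), encoding="utf-8")
        count += 1
    return count
```

Calling something like clean_directory("ocr_output", "ocr_cleaned") (both folder names are placeholders) would process a whole folder in one pass, so nobody has to open each file individually. Writing to a separate folder keeps the original OCR output available for comparison.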

I have been searching online and so far haven't found anything that will
work for my situation.

thanks in advance,

*Erica Findley*
Cataloging/Metadata Librarian
Multnomah County Library
Phone: 503.988.5466
eri...@multco.us
www.multcolib.org
