Tabula is pretty miraculous at turning hamburgers back into cows, but scans from the 70s are a lot to ask. Still, I would try it: https://tabula.technology/
Christina

-----Original Message-----
From: Code for Libraries <[email protected]> On Behalf Of Haitz, Lisa (haitzlm)
Sent: Tuesday, June 21, 2022 3:02 PM
To: [email protected]
Subject: [EXT] Re: [CODE4LIB] Converting old tables in PDF to CSV

Acrobat (full version) has an export to excel function. I’ve used it before and my table data was exported correctly as each value was in an excel cell. 😊

From: Code for Libraries <[email protected]> on behalf of Matt Sherman <[email protected]>
Date: Tuesday, June 21, 2022 at 2:53 PM
To: [email protected] <[email protected]>
Subject: Re: [CODE4LIB] Converting old tables in PDF to CSV

Hm, that should be doable, but an annoying amount of work. I haven't done it with tables but I have done it with bibliographic records and regex. Helps if there is a very consistent structure to the OCR.

On Tue, Jun 21, 2022 at 1:47 PM Medina-Smith, Andrea M. (Fed) <[email protected]> wrote:
> Hello List,
>
> Has anyone had success converting tables in a PDF to CSV? These are
> scans of paper from the 70s on forward. I know this isn’t a super easy
> conversion, but I would think it’s not impossible either.
>
> Thanks,
> Andrea
>
> --
>
> Andrea Medina-Smith
> Data Librarian
> Information Services Office
> National Institute of Standards and Technology
> [email protected]
> https://orcid.org/0000-0002-1217-701X
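
For what it's worth, the regex approach Matt mentions in the thread above could look roughly like the sketch below. It is only an illustration, not anyone's actual workflow: it assumes the scans have already been OCR'd to plain text and that columns are separated by runs of two or more spaces, which real OCR output may not honor. The function name `ocr_text_to_csv` and the sample rows are made up for the example.

```python
import csv
import io
import re

# Assumption: columns in the OCR text are separated by two or more
# whitespace characters, while single spaces occur only inside a cell.
COLUMN_GAP = re.compile(r"\s{2,}")


def ocr_text_to_csv(ocr_text: str) -> str:
    """Convert whitespace-aligned table text into CSV text (hypothetical helper)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in ocr_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between rows
        # Split the row on the column gaps; csv.writer handles any quoting.
        writer.writerow(COLUMN_GAP.split(line))
    return out.getvalue()


# Made-up sample rows, standing in for OCR output of one scanned table.
sample = """\
Sample   Temp (K)   Pressure (kPa)
A-1      293.15     101.3
A-2      303.15     99.8
"""
print(ocr_text_to_csv(sample))
```

As Matt says, this only helps if the OCR structure is very consistent. Rows where a cell is empty or the spacing collapses to a single space will come out with the wrong number of columns, so checking the column count per row before trusting the CSV is probably worthwhile.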
