[datameet] textricator | Generate structured data from PDFs

Dilawar Singh Sun, 19 Jul 2020 22:55:11 -0700

Found this tool today. It can help you get data out of PDFs.

*Textricator* is a tool to extract text from documents and generate 
structured data.


If you have a bunch of PDFs with the same format (or one big, consistently 
formatted PDF) and you want to extract the data to CSV or JSON, 
*Textricator* can help! It can even work on OCR'ed documents!
https://github.com/measuresforjustice/textricator

best,
   Dilawar

-- 
Datameet is a community of Data Science enthusiasts in India. Know more about 
us by visiting http://datameet.org
