Dear all,

I am developing a tool to extract a table from an image. It is a big
undertaking but I hope to release a beta version soon.

The input to the tool is a PNG/JPG/PDF image and the output is a
CSV/ODT/XLS table.

I have some simple tables extracted from PDFs. If there are formats that
the government uses often and that people often need to digitize, I'd like
to have some samples. I am thinking of census data, GIS data, etc.

There is no plan to support multi-page tables. I could use some advice on
the OCR backend (for now I am using pytesseract, a Python wrapper for the
Tesseract OCR engine).
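
As a rough illustration of the OCR step, and not the tool's actual code,
the sketch below groups OCR word boxes into rows by vertical position and
writes them out as CSV. It assumes the word boxes come from
pytesseract.image_to_data with output_type=Output.DICT; the function names
(ocr_image, words_to_rows, rows_to_csv) and the row_tolerance parameter are
hypothetical, and treating every word as its own cell is a simplification.

```python
import csv
import io


def ocr_image(path):
    """Run Tesseract on an image and return word boxes as a dict of lists.

    Assumes Pillow, pytesseract and the Tesseract binary are installed.
    """
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_data(
        Image.open(path), output_type=pytesseract.Output.DICT
    )


def words_to_rows(data, row_tolerance=10):
    """Group words into rows using their top coordinate (in pixels).

    `data` needs the parallel lists "text", "left" and "top".
    A word starts a new row when it sits more than `row_tolerance`
    pixels below the first word of the current row (hypothetical rule).
    """
    words = sorted(
        (top, left, text.strip())
        for text, left, top in zip(data["text"], data["left"], data["top"])
        if text.strip()  # skip the empty entries Tesseract emits for blocks/lines
    )
    rows, current, current_top = [], [], None
    for top, left, text in words:
        if current_top is not None and top - current_top > row_tolerance:
            rows.append(current)
            current, current_top = [], None
        if current_top is None:
            current_top = top
        current.append((left, text))
    if current:
        rows.append(current)
    # Order the cells of each row from left to right.
    return [[text for _, text in sorted(row)] for row in rows]


def rows_to_csv(rows):
    """Serialize the rows to a CSV string."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()
```

Usage would be along the lines of
rows_to_csv(words_to_rows(ocr_image("table.png"))), where "table.png" is a
placeholder path. Cells containing several words would need an extra step
that merges words by horizontal gaps or detected column lines.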

best,
    Dilawar

--
Dilawar Singh, Ph.D.
LinkedIn <https://www.linkedin.com/in/dilawar-singh-ph-d-44b81b194/>
ORCID <https://orcid.org/0000-0002-4645-3211>
GitHub <https://github.com/dilawar>
