Dear all,

I am developing a tool to extract tables from images. It is a big undertaking, but I hope to release a beta version soon.
The tool takes a PNG/JPG/PDF image as input and produces a CSV/ODT/XLS table as output. I already have some simple tables extracted from PDFs. If there are formats that the government uses often and that people often need or want to digitize, I'd like to have some samples. I am thinking of census data, GIS data, etc. There is no plan to support multi-page tables.

I could use some advice on the OCR backend (for now I am using pytesseract, a Python wrapper for Google's Tesseract OCR engine).

Best,
Dilawar

--
Dilawar Singh, Ph.D.
LinkedIn <https://www.linkedin.com/in/dilawar-singh-ph-d-44b81b194/>
ORCID <https://orcid.org/0000-0002-4645-3211>
Github <https://github.com/dilawar>
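
For readers wondering what the image-to-CSV step might involve, here is a minimal sketch in Python. It is not the tool's actual method, only an illustration under stated assumptions: the OCR backend is assumed to have already produced one box per table cell as a (text, left, top) tuple (pytesseract's image_to_data can supply word text and coordinates of this kind), and the names boxes_to_rows, rows_to_csv, row_tol and the sample values are hypothetical.

```python
import csv
import io


def boxes_to_rows(boxes, row_tol=10):
    """Group (text, left, top) boxes into table rows.

    Boxes whose `top` lies within `row_tol` pixels of the first box
    of the current row are treated as belonging to that row.
    Assumption: each box is exactly one cell (no multi-word cells).
    """
    rows = []
    for text, left, top in sorted(boxes, key=lambda b: b[2]):
        if rows and abs(top - rows[-1]["top"]) <= row_tol:
            rows[-1]["cells"].append((left, text))
        else:
            rows.append({"top": top, "cells": [(left, text)]})
    # Order the cells of each row from left to right.
    return [[text for _, text in sorted(r["cells"])] for r in rows]


def rows_to_csv(rows):
    """Serialize a list of rows to CSV text using the standard library."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Made-up placeholder boxes standing in for OCR output: (text, left, top).
sample_boxes = [
    ("Name", 10, 12), ("Count", 120, 14),
    ("A", 10, 52), ("10", 120, 50),
    ("B", 10, 90), ("20", 120, 91),
]

table = boxes_to_rows(sample_boxes)
csv_text = rows_to_csv(table)
print(csv_text)
```

Real scans would need more than this (skew correction, cells containing several words, ruled versus unruled tables), so the fixed pixel tolerance should be read as a placeholder rather than a recommendation.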