Hi Shiv,

Do you mind sharing a couple of sample PDFs? Do they contain structured data such as tables, or some other type of data?
On Tuesday, April 9, 2024 at 11:33:28 PM UTC+5:30 Shiv Hastawala wrote:

> Hi data enthusiasts
>
> I have a lot of publicly available data which are pdf scans of old
> publications. I wish to digitize them as a public service. I found that the
> following python package is pretty efficient at doing this job:
>
> https://layout-parser.readthedocs.io/en/latest/
>
> However, since I am python-illiterate, I was wondering if any of you
> python enthusiasts would be interested in writing the code for this
> project? Obviously, this is voluntary work.
>
> Please reply to me personally if you are interested. Thanks!
>
> Thanks and regards.
>
> Yours sincerely
>
> *Shiv Hastawala*
> (He/His/Him)
> Doctoral Candidate
> Department of Economics
> Binghamton University (State University of New York)
>
> Email ID: shastaw1[at]binghamton[dot]edu
> Zoom ID: 201 717 2613
> www.shivhastawala.com

--
Datameet is a community of Data Science enthusiasts in India. Know more about us by visiting http://datameet.org