Thanks, I will look into them.

On 9 February 2016 at 03:56, Han, Yan - (yhan) <y...@email.arizona.edu> wrote:
> Yes. Use iText or PDFBox.
>
> These are common PDF libraries.
>
> On 2/6/16, 2:24 PM, "Code for Libraries on behalf of Andrew Cunningham" <CODE4LIB@LISTSERV.ND.EDU on behalf of lang.supp...@gmail.com> wrote:
>
> >Hi all,
> >
> >I am working with PDF files in some South Asian and South East Asian
> >languages. Each PDF has ActualText added for each tag in the PDF. Each PDF
> >has ActualText as an alternative for the visible text layer in the PDF.
> >
> >Is anyone aware of tools that will allow me to index and search PDFs based
> >on the ActualText content rather than the visible text layers in the PDF?
> >
> >Andrew
> >
> >--
> >Andrew Cunningham
> >lang.supp...@gmail.com

--
Andrew Cunningham
lang.supp...@gmail.com
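
To make the ActualText idea in the question concrete, here is a minimal sketch in plain Python. It is not the iText or PDFBox API. It assumes the relevant PDF data is uncompressed, so that /ActualText entries are visible in the raw bytes, and that literal strings contain no escapes or nested parentheses. Most real PDFs compress their streams, so a proper PDF library is still needed in practice. The function names are hypothetical, chosen only for illustration.

```python
import re

# Matches "/ActualText" followed by either a hex string <...> or a simple
# literal string (...). Assumption: literal strings have no escapes or
# nested parentheses; this is a sketch, not a full PDF parser.
_ACTUAL_TEXT = re.compile(
    rb"/ActualText\s*(?:<([0-9A-Fa-f\s]*)>|\(([^()\\]*)\))"
)


def decode_pdf_string(raw):
    """Decode a PDF text string to str.

    Strings starting with the bytes FE FF are UTF-16BE. Anything else is
    decoded as Latin-1 here, which only approximates PDFDocEncoding.
    """
    if raw.startswith(b"\xfe\xff"):
        return raw[2:].decode("utf-16-be", errors="replace")
    return raw.decode("latin-1")


def extract_actual_text(data):
    """Return every /ActualText value found in uncompressed PDF bytes."""
    results = []
    for match in _ACTUAL_TEXT.finditer(data):
        hex_body, literal = match.group(1), match.group(2)
        if hex_body is not None:
            digits = b"".join(hex_body.split())  # drop whitespace between digits
            if len(digits) % 2:
                digits += b"0"  # an odd trailing digit is padded with 0
            raw = bytes.fromhex(digits.decode("ascii"))
        else:
            raw = literal
        results.append(decode_pdf_string(raw))
    return results


if __name__ == "__main__":
    # Hypothetical marked-content fragment carrying Devanagari ActualText.
    sample = b"/Span <</ActualText <FEFF0915094D0937>>> BDC (glyphs) Tj EMC"
    for text in extract_actual_text(sample):
        print(text)
```

The extracted strings could then be passed to whatever indexer is in use in place of the visible text layer.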