On Tue, Mar 8, 2016 at 2:34 PM, Jeroen Ooms <jeroen.o...@stat.ucla.edu> wrote:
> When extracting text from a landscape pdf file using the cpp
> interface, text at the far right of the page does not get extracted. I
> think the problem is that page.text() always assumes portrait
> orientation and hence underestimates the width of the page:
>
>   p->text()
>   p->text(p->page_rect())
>
> Is this expected? What is the best way to extract all text from the
> page, irrespective of size and orientation?
>
> An example landscape pdf is here:
> https://github.com/ropensci/pdftools/files/161587/waurika_news_democrat.pdf

I would still be very interested in a fix or workaround for this
problem. I tried looking through the source, but I don't understand it
well enough to figure out what is going wrong here. Any help would be
greatly appreciated.
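
One untested idea for a workaround, assuming the diagnosis above is
right and the extraction rectangle is simply too narrow for rotated
pages: pass text() a rectangle that is large enough for either
orientation. The sketch below only shows the geometry. Rect is a
hypothetical stand-in for poppler::rectf and covering_rect is a made-up
helper name; neither is part of poppler.

```cpp
#include <algorithm>

// Hypothetical stand-in for poppler::rectf, so that this sketch
// compiles without poppler installed.
struct Rect {
    double x, y, w, h;
};

// Returns a square with the same origin as the page rectangle whose
// side is the longer page edge. It therefore covers the page whether
// or not width and height were swapped for a landscape page.
Rect covering_rect(const Rect &page) {
    const double side = std::max(page.w, page.h);
    return Rect{page.x, page.y, side, side};
}
```

With poppler itself, the same values could presumably be taken from
p->page_rect() and handed back as a poppler::rectf to p->text(). I have
not verified that this recovers the missing text in the example file.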
_______________________________________________
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler
