Problems with Java PDFBox

Natalia Gómez García Sun, 09 Sep 2012 03:54:52 -0700

Hello,

I am a computer science student, and I am using your PDFBox library in Java
to extract text from some PDF files.


In this project, I am having difficulty extracting the text from this
PDF: http://www.escet.urjc.es/alumnos/horarios/GR_Biologia_2012-13.pdf.
Specifically, I cannot extract the text "Semana del 3 al 7 de
Septiembre de 2012".

Why might this be happening? Could you please give me some pointers on how
to extract this text?

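The code I am currently using is the following:

    // Load the PDF from the URL and extract all of its text
    pdfDoc = PDDocument.load(url);
    pdfStripper = new PDFTextStripper();
    texto = pdfStripper.getText(pdfDoc);
    // Release the document's resources once the text has been read
    pdfDoc.close();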

Thank you for your attention,
Natalia
