I am having trouble with the getText method (as used by ExtractText) because
it concatenates all of the text together.
I would like to go a step deeper, pull out each COSString value, and delimit
them.
Below is the code I am using thus far to get all text.
I am not
    // declared outside the try block so that finally can close it
    PDDocument doc = null;
    try {
        PDFTextStripper pdfTextStripper = new PDFTextStripper();
        doc = PDDocument.load(stream);
        return pdfTextStripper.getText(doc);
    } finally {
        quietlyClose(doc);
    }
I noticed that the logs show the operators and types, but some strings
are broken up into multiple COSString entries within arrays.
What methods can I use to traverse all of the fields and pick out the
COSStrings?
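To make the goal concrete, here is a rough sketch of the result I am after. It
does not use PDFBox at all: plain Java Strings stand in for COSString values,
nested Lists stand in for the arrays, and Integers stand in for the numbers
that sit between the string pieces. The class and method names
(StringOperandSketch, collectStrings, delimit) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StringOperandSketch {

    // Adds every String reachable from token to out, descending into nested lists.
    // ASSUMPTION: String stands in for COSString and List for an array of operands.
    static void collectStrings(Object token, List<String> out) {
        if (token instanceof String) {
            out.add((String) token);
        } else if (token instanceof List) {
            for (Object element : (List<?>) token) {
                collectStrings(element, out);
            }
        }
        // Anything else (for example the numbers inside an array) is skipped.
    }

    // Joins all collected strings with the given delimiter, in document order.
    static String delimit(List<Object> tokens, String delimiter) {
        List<String> strings = new ArrayList<>();
        for (Object token : tokens) {
            collectStrings(token, strings);
        }
        return String.join(delimiter, strings);
    }

    public static void main(String[] args) {
        // Hypothetical token list: one plain string, one array that splits a
        // word into two pieces around a number, and another plain string.
        List<Object> tokens = Arrays.<Object>asList(
                "Hello",
                Arrays.<Object>asList("Wor", -120, "ld"),
                "Thanks");
        System.out.println(delimit(tokens, "|")); // prints Hello|Wor|ld|Thanks
    }
}
```

In other words, I want each string piece as its own delimited value, including
the pieces that live inside arrays, rather than one concatenated block of text.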
Thanks