When I read the whole document at once (.NET code):
///////////////////////////////////////////////

java.io.StringWriter w = new java.io.StringWriter();

StreamWriter sw = new StreamWriter(fs);

PDFTextStripper stripper = new PDFTextStripper("UTF-8");

stripper.setSortByPosition(false);

stripper.setStartPage(0);

stripper.setEndPage(1000);

stripper.writeText(document, w);

sw.Write(w.toString());

//////////////////////////////////////////

I see too many spaces inside and between words.



But if I read one page after another:

////////////////////////////////////////////////

for (int k = 1; k <= document.getPageCount(); ++k)

{

  stripper.setStartPage(k);

  stripper.setEndPage(k);

  stripper.writeText(document, w); 

}

///////////////////////////////////////////////////

there are no additional spaces.

The extra spaces have appeared since PDFBox version 1.0, and only in some PDF files.
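
To pin down where the two outputs diverge, they can be compared directly. Below is a minimal sketch in plain Java (no PDFBox calls). The class name SpaceDiff, both helper methods, and the sample strings are hypothetical and only stand in for the real extraction results:

```java
class SpaceDiff {
    // Counts the plain space characters (U+0020) in the extracted text.
    static int countSpaces(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == ' ') {
                count++;
            }
        }
        return count;
    }

    // Returns the index of the first character at which the two strings
    // differ, or -1 if they are identical.
    static int firstDifference(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return a.length() == b.length() ? -1 : limit;
    }

    public static void main(String[] args) {
        // Made-up sample strings, NOT real PDFBox output.
        String wholeDocument = "To o many  spaces";
        String pageByPage = "Too many spaces";

        int extra = countSpaces(wholeDocument) - countSpaces(pageByPage);
        System.out.println("extra spaces: " + extra);
        System.out.println("first difference at index: "
                + firstDifference(wholeDocument, pageByPage));
    }
}
```

Feeding it the text from the whole-document run and the text from the page-by-page loop should show how many spaces were added and where the first one appears.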



Can you explain why?

Thanks.

 Konstantin