I am analysing running text by capturing the org.apache.pdfbox.util.TextPosition objects that PDFBox emits, using a subclass of org.apache.pdfbox.pdfviewer.PageDrawer. I notice that there are explicit space characters (char 32). Sometimes there are repeated spaces, and even a "paragraph" consisting only of a space. I was unaware that PDF supported spaces. Are these coming from the original document, or are they generated by PDFBox from calculations of character spacing and width?
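To make the observation concrete, here is a minimal sketch of how the space runs could be tallied. It assumes each captured TextPosition has already been reduced to its character string; the class SpaceRuns and both method names are hypothetical and not part of PDFBox. It only counts what was captured and says nothing about where the spaces originate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpaceRuns {

    /**
     * Returns the length of each maximal run of consecutive space
     * characters (char 32), in the order the runs occur.
     * Input: one string per captured glyph (an assumed simplification).
     */
    static List<Integer> spaceRunLengths(List<String> glyphs) {
        List<Integer> runs = new ArrayList<Integer>();
        int current = 0;
        for (String g : glyphs) {
            if (" ".equals(g)) {
                current++;              // still inside a run of spaces
            } else if (current > 0) {
                runs.add(current);      // a non-space glyph ends the run
                current = 0;
            }
        }
        if (current > 0) {
            runs.add(current);          // run that reaches the end of the input
        }
        return runs;
    }

    /** True when the sequence is non-empty and every glyph is a space. */
    static boolean isSpaceOnly(List<String> glyphs) {
        if (glyphs.isEmpty()) {
            return false;
        }
        for (String g : glyphs) {
            if (!" ".equals(g)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> line = Arrays.asList("a", " ", " ", "b", " ");
        System.out.println(spaceRunLengths(line));
        System.out.println(isSpaceOnly(Arrays.asList(" ")));
    }
}
```

A run length greater than 1 corresponds to the repeated spaces mentioned above, and isSpaceOnly flags a "paragraph" that consists only of spaces.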
TIA for help.

P.

-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069

