PDFdev is a service provided by PDFzone.com | http://www.pdfzone.com
At 5:46 PM -0500 12/23/03, Palanivelu, Sridharan wrote:
I am trying to get a word list from a PDF document, and I am using the PDWordFinder APIs to acquire it (I have passed NULL for the "outEncInfo" parameter of the PDDocCreateWordFinder function).
Sounds good...
I am getting three words for the string "05/01/2003" in the PDF file: first "05/", second "01/", and third "2003".
That is correct, given the options you provided...
Is there any way to get the whole string as one single word?
Change the word breaking algorithm/tables used by PDWordFinder...
Or, if you are in Acrobat 6, you can use the new functions for getting the entire text of a page and then apply your own word tests...
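As a rough illustration of the "apply your own word tests" route, here is a minimal sketch in plain C. It assumes you already have the page text in a NUL-terminated buffer (how you get it from Acrobat is not shown), and NextWord is a hypothetical helper of my own naming, not an Acrobat SDK call. It breaks words on whitespace only, so a date such as "05/01/2003" stays in one piece.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Acrobat SDK.
 * Scans forward from *cursor, skips leading whitespace, and returns a
 * pointer to the next whitespace-delimited word, storing its length in
 * *len. Punctuation such as '/' is treated as part of the word.
 * Advances *cursor past the word. Returns NULL (and sets *len to 0)
 * when no words remain. The returned word is NOT NUL-terminated. */
static const char *NextWord(const char **cursor, size_t *len)
{
    assert(cursor != NULL && *cursor != NULL && len != NULL);

    const char *p = *cursor;

    /* Skip any whitespace before the word. */
    while (*p != '\0' && isspace((unsigned char)*p)) {
        p++;
    }
    if (*p == '\0') {
        *cursor = p;
        *len = 0;
        return NULL;
    }

    /* Consume everything up to the next whitespace or end of text. */
    const char *start = p;
    while (*p != '\0' && !isspace((unsigned char)*p)) {
        p++;
    }

    *len = (size_t)(p - start);
    *cursor = p;
    return start;
}
```

To use it, point a cursor at the start of the page text and call NextWord in a loop until it returns NULL, copying each word out by pointer and length. A whitespace-only rule is the simplest possible test; if you also want trailing commas or periods stripped, that would be an extra step on each returned word.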
Leonard
--
Leonard Rosenthol <mailto:[EMAIL PROTECTED]>
Chief Technical Officer <http://www.pdfsages.com>
PDF Sages, Inc.
215-629-3700 (voice)
215-629-0789 (fax)
