Hi Yoni,

this is an XFA document. The French text is part of a JavaScript message.
ExtractText works on PDF text objects, which is why you don't get the XFA form part.

BR
Maruan Sahyoun

On 11.06.2013 at 13:45, Yoni Amir <[email protected]> wrote:

> Hello,
> I have a pdf document (link here:
> https://www.dropbox.com/s/vr2xi5cf0uzur69/TEST_POC_DS_01.pdf).
>
> I think it is an XFA document, although I am not 100% sure how to verify
> this. So I apologize in advance if this question is misdirected.
> When I run the sample ExtractText class on this file, I am not receiving the
> actual text in the pdf. Rather, I receive the generic text hidden in the pdf
> similar to this:
>
> "Please wait... If this message is not eventually replaced by the proper
> contents of the document, your PDF
> viewer may not be able to display this type of document."
>
> The text returned also contains stuff in French (link here:
> https://www.dropbox.com/s/oor4gj7wbhue8yc/TEST_POC_DS_01.txt)
> but I haven't figured out what it is. It is not text that is visible in the
> PDF file.
>
> Thanks,
> Yoni
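
Since the XFA part is stored as XML rather than as PDF text objects, the form text has to be read from that XML instead. Below is a rough sketch of that second step only, using nothing but the JDK's XML parser. It assumes you already have the XFA XML as a byte array; in PDFBox the XFA entry is reached through the AcroForm (PDAcroForm.getXFA()), but how you get the bytes from there depends on the PDFBox version, so that part is left out. The class name XfaTextSketch, the method collectText and the sample XML are made up for illustration and are not taken from TEST_POC_DS_01.pdf.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of PDFBox.
class XfaTextSketch {

    /** Returns the trimmed, non-empty text and CDATA nodes of the XML in document order. */
    public static List<String> collectText(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // The XML comes from an untrusted PDF, so refuse DOCTYPE declarations.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        List<String> out = new ArrayList<>();
        walk(doc.getDocumentElement(), out);
        return out;
    }

    // Depth-first walk that keeps only nodes carrying character data.
    private static void walk(Node node, List<String> out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.add(text);
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), out);
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up XFA-like sample; a real form will have far more packets.
        String sample =
            "<xdp:xdp xmlns:xdp=\"http://ns.adobe.com/xdp/\">"
          + "<xfa:datasets xmlns:xfa=\"http://www.xfa.org/schema/xfa-data/1.0/\">"
          + "<xfa:data><form1><name>Yoni</name><city>Paris</city></form1></xfa:data>"
          + "</xfa:datasets>"
          + "</xdp:xdp>";
        for (String s : collectText(sample.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(s);
        }
    }
}
```

Note that this collects every text node, so on a real XFA form it would also pick up script bodies such as the JavaScript messages mentioned above. Restricting the walk to the datasets packet is probably what you want if you only care about field values.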

