What is the preferred way, if there is one, to determine whether a document
(namely a PDF) is of a "binary nature"?

I am extracting text from many PDF user manuals for Lucene indexing, and some of
them yield "absurd binary terms", which I would like to omit.
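
To illustrate the kind of check I have in mind, here is a rough sketch that
flags extracted text as "binary" when too few of its characters are letters,
digits, whitespace, or common punctuation. The class name, the method name, the
punctuation set, and the 0.8 threshold are all made up for illustration; none
of this comes from Lucene or from any PDF library, and the threshold would
surely need tuning against real manuals.

```java
final class BinaryTermFilter {
    // Hypothetical threshold: minimum share of "readable" characters.
    static final double MIN_READABLE_RATIO = 0.8;

    // Hypothetical set of punctuation treated as readable.
    private static final String PUNCTUATION = ".,;:!?()-'\"/";

    /** Returns true if the text looks like binary garbage rather than prose. */
    static boolean looksBinary(String text) {
        if (text == null || text.isEmpty()) {
            return false; // nothing to judge
        }
        int readable = 0;
        int total = 0;
        // Iterate by code point so surrogate pairs count as one character.
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            total++;
            if (Character.isLetterOrDigit(cp)
                    || Character.isWhitespace(cp)
                    || PUNCTUATION.indexOf(cp) >= 0) {
                readable++;
            }
        }
        return (double) readable / total < MIN_READABLE_RATIO;
    }
}
```

The idea would be to call something like looksBinary on each extracted page or
term and skip it before it reaches the index, but I do not know whether there
is a more standard approach.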

Thx
Clemens
