Hi, I'm parsing a Word document using Apache POI.
The problem I have right now is that after parsing, the resulting String
(I'm using Java) still contains some special characters.
A couple of examples:

<TitreType>DRAFT REPORT</TitreType>

<RefProcLect>***I</RefProcLect>

I'm not sure how to remove these: if I strip everything that appears
between < and >, I might end up removing parts of the real document.
Is there a way to remove only the special characters added by Word?
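
To make the concern concrete, here is a minimal sketch of the naive approach
I would like to avoid. It uses only java.util.regex, not POI; the class name,
method name and the second sample string are made up for illustration. It
cleans up the first example above, but it also deletes ordinary text that
happens to sit between angle brackets:

```java
import java.util.regex.Pattern;

public class NaiveTagStripper {
    // Matches a '<', then any run of characters that are not '>', then '>'.
    private static final Pattern ANGLE = Pattern.compile("<[^>]*>");

    // Removes every <...> span, whether it is a Word marker or real text.
    public static String stripAngleBrackets(String text) {
        return ANGLE.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // Marker from the parsed document: only the payload survives.
        System.out.println(stripAngleBrackets("<TitreType>DRAFT REPORT</TitreType>"));
        // Hypothetical body text: "<b and c>" is lost although it is real content.
        System.out.println(stripAngleBrackets("if a <b and c> d then stop"));
    }
}
```

The first call prints "DRAFT REPORT" as wanted, but the second one silently
drops "<b and c>" from the sentence, which is exactly the risk described above.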



