Hi,

I am trying to extract text from a PDF and then process that text. The
extraction itself works, but the result is of limited use because
superscripts, usually numbers, are extracted as normal text.

When a superscript is attached to the last word of a sentence, it ends up
after the period (.) in the extracted text.

Example: the word "test" with the superscript "super".
When it appears at the end of a sentence, it is extracted as:
"test.super"

Is there any way I can get rid of superscripts?
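
To make the request concrete, here is a minimal sketch of a text-level
workaround applied after extraction. It assumes the superscripts are numeric
markers fused directly after sentence-ending punctuation, as in the example
above. The helper name strip_numeric_superscripts is hypothetical and not part
of any PDF library, and the pattern is only a heuristic: it cannot tell a
superscript from a real number written the same way, and it does not handle
non-numeric superscripts.

```python
import re

# Hypothetical helper; an assumption for illustration, not a PDF library API.
# Matches digits that directly follow a letter plus '.', '!' or '?', and that
# are themselves followed by whitespace or the end of the text.
_FUSED_MARKER = re.compile(r"(?<=[A-Za-z][.!?])\d+(?=\s|$)")


def strip_numeric_superscripts(text: str) -> str:
    """Remove numeric markers fused onto the end of a sentence."""
    return _FUSED_MARKER.sub("", text)


if __name__ == "__main__":
    sample = "This is a test.12 Next sentence."
    print(strip_numeric_superscripts(sample))
```

Decimal numbers such as "2.5" are left alone because the character before the
period is a digit, not a letter. A cleaner solution would use the font size or
vertical position of each glyph during extraction, if the extraction tool
exposes that information.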

-- 
Br,
Siva.
