Yes, it does seem I started with the most difficult problem while 
just picking up iText and the book.
I have the .jar, but I think it's time for me to download the iText source.

Thank you for pointing me there. That does look like the best place to 
start, and it seems I would need to rewrite or extend PRTokeniser or 
something similar.

I am aware of some of what is involved. I wrote a program that found the 
rectangles where words were probably located inside a drag rectangle.
It didn't do a very good job; Adobe Acrobat Pro does much better, 
and it'll figure out some of the text for me too.
If I can just crack the detection puzzle, I think I'll be fine.
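
To make the detection problem concrete, here is a rough sketch of the spacing step Kevin describes below: merging the rectangles of consecutive text draws into word rectangles whenever the horizontal gap between them is small. This is only a guess at the general idea, not iText's actual algorithm. The Box class, mergeIntoWords, and both thresholds are hypothetical, and it assumes the rectangles already arrive in reading order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from iText.
class WordRegionSketch {

    /** Axis-aligned rectangle in PDF user space (origin at bottom-left). */
    static final class Box {
        final float left, bottom, right, top;

        Box(float left, float bottom, float right, float top) {
            this.left = left;
            this.bottom = bottom;
            this.right = right;
            this.top = top;
        }

        /** Smallest rectangle containing both this box and the other one. */
        Box union(Box o) {
            return new Box(Math.min(left, o.left), Math.min(bottom, o.bottom),
                           Math.max(right, o.right), Math.max(top, o.top));
        }
    }

    /**
     * Merges consecutive boxes into word boxes.
     * Assumption: the input is already in reading order.
     *
     * @param maxGap        largest horizontal gap still treated as "same word"
     * @param lineTolerance largest difference in bottom edge still treated as "same line"
     */
    static List<Box> mergeIntoWords(List<Box> boxes, float maxGap, float lineTolerance) {
        List<Box> words = new ArrayList<>();
        Box current = null;
        for (Box next : boxes) {
            if (current == null) {
                current = next;
                continue;
            }
            boolean sameLine = Math.abs(next.bottom - current.bottom) <= lineTolerance;
            float gap = next.left - current.right;
            // A small negative gap is allowed because kerning can make boxes overlap.
            boolean closeEnough = gap <= maxGap && gap >= -maxGap;
            if (sameLine && closeEnough) {
                current = current.union(next);   // still inside the same word
            } else {
                words.add(current);              // gap or line break ends the word
                current = next;
            }
        }
        if (current != null) {
            words.add(current);
        }
        return words;
    }
}
```

In practice the gap threshold would probably have to scale with the font size or the width of the space character rather than being a fixed number, which is presumably where the hard part lies.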

Thanks for the help!

kb


On 7/16/12 10:04 AM, Kevin Day wrote:
> The text parser would be the best place for you to start looking.  It
> determines the rectangles for each text draw operation (which is not the
> same as what you are asking, but it's a starting point at least).  What you
> are asking for is very difficult to do with PDF in the general case because
> PDF doesn't have a concept of words, but that will get you a starting point
> so you'll at least understand what is involved.  Pay particular attention to
> the part of the algorithm that figures out spaces between words.
>
> Good luck!
>
> --
> View this message in context: 
> http://itext-general.2136553.n4.nabble.com/Calculating-text-regions-of-individual-words-from-an-existing-PDF-tp4655616p4655622.html
> Sent from the iText - General mailing list archive at Nabble.com.
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> iText-questions mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> iText(R) is a registered trademark of 1T3XT BVBA.
> Many questions posted to this list can (and will) be answered with a 
> reference to the iText book: http://www.itextpdf.com/book/
> Please check the keywords list before you ask for examples: 
> http://itextpdf.com/themes/keywords.php
>
>




