Kevin, Shujaat,

Kevin Day wrote
> Bear in mind that there is no "X-coordinate", per-se.  But I can provide a
> float array with the offset of each glyph along the baseline vector.

I think Shujaat is thinking mainly in terms of his own use case, which does
not involve any rotation (neither a /Rotate entry nor a rotating affine
transformation); for him, therefore, an array of "X coordinates" does indeed
exist and is sufficient.

Kevin Day wrote
> I'd like to get Michael's thoughts on this as well (it sounds like he has
> some other ideas on how this might be used).  His idea of returning an
> array or list of sub-TextRenderInfo objects has some appeal (it's
> certainly more object oriented, and may be easier to use overall).  But I
> don't want to make things more complicated than they have to be, either.

Well, the "other ideas on how this might be used" are derived from other use
cases presented here on the list, especially the "marked for redaction" one
which essentially is about determining which characters exactly are inside a
given area, not merely which TextRenderInfo strings. If I had to program an
ExactLocationBasedExtractionStrategy, I would not be happy to merely get
those offsets.

I think that as soon as you have to consider rotations, you are generally
much better off getting more information: at least the baseline segment,
and better still the whole character box. In general this even saves
calculations, since the caller does not have to reconstruct positions from
the offsets.
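
To illustrate why the bare offsets are not enough once the baseline is
rotated, here is a minimal sketch of what a caller would have to do to get
from offsets to a per-glyph baseline segment. Everything in it is
hypothetical: GlyphBaseline, pointAt and glyphSegment are not iText API, and
I simply assume that the offset array has one entry more than there are
glyphs (start of each glyph plus the end of the last one).

```java
// Hypothetical sketch; none of these names are part of the iText API.
public class GlyphBaseline {
    /** Simple immutable 2D point. */
    public static final class Point {
        public final double x, y;
        public Point(double x, double y) { this.x = x; this.y = y; }
    }

    /**
     * Maps an offset along the baseline to a point on the page.
     * start is the baseline start of the whole chunk,
     * angleRadians the rotation of the baseline against the x axis.
     */
    public static Point pointAt(Point start, double angleRadians, double offset) {
        return new Point(start.x + offset * Math.cos(angleRadians),
                         start.y + offset * Math.sin(angleRadians));
    }

    /**
     * Returns the start and end point of the baseline segment of glyph i.
     * Assumption: offsets has (number of glyphs + 1) entries.
     */
    public static Point[] glyphSegment(Point start, double angleRadians,
                                       float[] offsets, int i) {
        return new Point[] {
            pointAt(start, angleRadians, offsets[i]),
            pointAt(start, angleRadians, offsets[i + 1])
        };
    }
}
```

Without rotation (angle 0) the result degenerates to the plain "X
coordinates"; with rotation the caller needs the start point and direction
in addition to the offsets, which is why I would rather get the segment
directly.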

The "sub-TextRenderInfo objects" mentioned can be lazily initialized
instances of an inner class of TextRenderInfo, which limits unneeded
calculations. You could even use a cursor pattern and so also strictly
limit memory requirements (which wouldn't be that big anyway...).
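
Roughly what I have in mind, as a sketch only: ChunkInfo stands in for
TextRenderInfo and GlyphInfo for the per-character sub-object. Both names,
the constructor arguments and the "offsets has length + 1 entries"
convention are my assumptions for illustration, not existing iText code.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch; ChunkInfo and GlyphInfo are illustrative, not iText classes.
public class ChunkInfo implements Iterable<ChunkInfo.GlyphInfo> {
    private final String text;
    private final double startX, startY, angle; // baseline start and rotation (radians)
    private final float[] offsets;              // assumed: text.length() + 1 entries

    public ChunkInfo(String text, double startX, double startY,
                     double angle, float[] offsets) {
        this.text = text;
        this.startX = startX;
        this.startY = startY;
        this.angle = angle;
        this.offsets = offsets;
    }

    /** Per-glyph view; geometry is only computed when somebody asks for it. */
    public final class GlyphInfo {
        private final int index;
        private double[] segment; // cached result, null until first access

        private GlyphInfo(int index) { this.index = index; }

        public char getCharacter() { return text.charAt(index); }

        /** Returns {x0, y0, x1, y1} of this glyph's baseline segment. */
        public double[] getBaselineSegment() {
            if (segment == null) {
                double dx = Math.cos(angle), dy = Math.sin(angle);
                segment = new double[] {
                    startX + offsets[index] * dx,     startY + offsets[index] * dy,
                    startX + offsets[index + 1] * dx, startY + offsets[index + 1] * dy
                };
            }
            return segment;
        }
    }

    /** Cursor-style access: glyph objects are created one at a time on demand. */
    @Override
    public Iterator<GlyphInfo> iterator() {
        return new Iterator<GlyphInfo>() {
            private int position = 0;

            @Override public boolean hasNext() { return position < text.length(); }

            @Override public GlyphInfo next() {
                if (!hasNext()) throw new NoSuchElementException();
                return new GlyphInfo(position++);
            }
        };
    }
}
```

A strategy that only needs the characters never triggers the geometry
calculation, and one that walks the glyphs with a for-each loop holds only
one GlyphInfo at a time.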

Regards,   Mikel.

PS: My thoughts on this are not driven by actual requirements I have in
some project here. I don't think, though, that in such a project I could
generally require my customer to limit the inputs to my program to PDFs
without any rotation.


