PDFName.getName() returns escaped name?!

Craig Ringer Wed, 28 Mar 2012 22:56:39 -0700

Hi all

I've been working with PDFName in my code and have run into a bit of an oddity I was hoping for comments on.


For any given string `fred', the operation:

   ( new PDFName(fred) ).getName().equals(fred)

isn't guaranteed to be true, because PDFName.getName() returns the *escaped* name. It strips the leading slash added by PDFName.escapeName(), so most of the time the returned name will be the same, but it's a good candidate for creating exciting bugs.

I'd like to be able to use PDFName instead of String as a map key (for clarity, mostly), but need to be able to get the original encapsulated string quickly (without decoding) and reliably.

I'd like to change PDFName so that it keeps a reference to the original name string and returns that from getName(). It should encode the name on the first call to the new getEncodedName() method, storing it in a local member, so short-lived PDFName objects don't waste time encoding strings. I'd also like to have getEncodedName() return a byte[], not a String, since an encoded PDF name isn't actually text data.
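
To make the proposal concrete, here is a rough sketch of the shape I have in mind. The class name LazyPdfName is made up for illustration and is not FOP's PDFName; the encode() helper is only a stand-in that prepends the slash and does no real escaping, and the sketch assumes single-threaded use.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the proposed behaviour; not FOP's actual PDFName. */
final class LazyPdfName {
    private final String name;   // original, unescaped name
    private byte[] encoded;      // cached on the first getEncodedName() call

    LazyPdfName(String name) {
        if (name == null) {
            throw new NullPointerException("name");
        }
        this.name = name;
    }

    /** Returns exactly the string passed to the constructor. */
    String getName() {
        return name;
    }

    /** Encodes lazily, so short-lived objects never pay for encoding. */
    byte[] getEncodedName() {
        if (encoded == null) {
            encoded = encode(name);
        }
        return encoded.clone();  // defensive copy; callers cannot alter the cache
    }

    // Stand-in encoder (assumption): adds the leading slash only.
    // Real escaping of special bytes is deliberately left out here.
    private static byte[] encode(String name) {
        return ("/" + name).getBytes(StandardCharsets.UTF_8);
    }

    // Equality and hashing use the original name, so the class works as a map key.
    @Override
    public boolean equals(Object o) {
        return o instanceof LazyPdfName && name.equals(((LazyPdfName) o).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}
```

With this shape, ( new LazyPdfName(fred) ).getName().equals(fred) holds for any non-null fred, and two objects built from the same string land on the same map entry.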

BTW, is there any reason FOP's PDF library uses java.lang.String when working with sequences of PDF data bytes? For example, the output of PDFName.escapeName(...) isn't really a "string" at all, in that it's not meaningful text in any encoding; it's just a byte sequence jammed into the lower 8 bits of Unicode code points. It's pretty confusing having it as a String (logically an array of Unicode characters) rather than as a byte[]. Right now, FOP also writes 8-bit characters in names incorrectly: for each *Unicode* *character* in a String whose value is between 128 and 255 inclusive, the toHex(...) and PDFName.escapeName(...) methods translate that value to hex and write it out. This is incorrect, because PDF names should be UTF-8, so it should be encoding to a UTF-8 byte sequence and then escaping.
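
Something along these lines is what I mean by "encode to UTF-8, then escape". It is only a sketch: PdfNameEscaper and its escapeName() are names I've made up here, not FOP's existing methods, and the set of bytes that get escaped (anything outside '!' to '~', plus '#' and the PDF delimiter characters) is my reading of the name syntax rules. It does not reject NUL, which isn't allowed in names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not part of FOP. */
final class PdfNameEscaper {
    // '#' plus the PDF delimiter characters, which must not appear raw in a name.
    private static final String MUST_ESCAPE = "()<>[]{}/%#";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Returns the name as PDF bytes: a leading '/', then escaped UTF-8 bytes. */
    static byte[] escapeName(String name) {
        // Step 1: turn the Unicode text into its UTF-8 byte sequence.
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream(utf8.length + 1);
        out.write('/');
        // Step 2: escape byte by byte, never character by character.
        for (byte b : utf8) {
            int v = b & 0xFF;  // treat the byte as unsigned
            if (v < 33 || v > 126 || MUST_ESCAPE.indexOf(v) >= 0) {
                out.write('#');
                out.write(HEX[v >> 4]);
                out.write(HEX[v & 0x0F]);
            } else {
                out.write(v);
            }
        }
        return out.toByteArray();
    }
}
```

For the single character U+00E9 (e with an acute accent) this writes /#C3#A9, the two UTF-8 bytes, whereas hex-encoding the character value directly, as described above, would give /#E9.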


--
Craig Ringer

POST Newspapers
276 Onslow Rd, Shenton Park
Ph: 08 9381 3088     Fax: 08 9388 2258
ABN: 50 008 917 717
http://www.postnewspapers.com.au/
