Re: PDFBox 1.1.0 release plan (Was: Next release & merging of trunks)

Timo Boehme Fri, 19 Mar 2010 09:23:02 -0700

>>>> Andreas has looked at PDFBOX-624 lately. The patch looked somewhat
>>>> suspicious to him, because it employs simple Java casting instead of
>>>> more complex bit manipulation. Any more experts here?
>>> I puzzled about the specification:
>>>
>>> Chapter 3.2, "Charstring Number Encoding", contains the following
>>> statement:
>>> "If the charstring byte contains the value 255, the next four bytes
>>> indicate a two's complement signed number. The first of these
>>> four bytes contains the highest order bits, the second byte
>>> contains the next higher order bits and the fourth byte contains
>>> the lowest order bits. This number is interpreted as a Fixed; that
>>> is, a signed number with 16 bits of fraction." 
>>>
>>> AFAIU the spec, we have to use a combination of bytes 1, 2 and 4. As it is
>>> a two's complement, we have to build the one's complement. And after that
>>> we have to cut that result to a 16-bit value. I've tried to implement some
>>> code to follow that path, but it didn't work yet. Probably I'm just wrong
>>> with my interpretation of the spec.
>>>
>>> Villu's implementation only uses bytes 1 and 2. Looking at the spec it
>>> seems to be insufficient, but it works. So, at least it is a workaround.


I have written a CFF parser with a converter to Type1. From this I
copied the following lines which handle the 5 byte charstring number
encoding:

int val;
// ---- b0 == 255; 5 byte operand; signed fixpoint - lower 16 bit fraction;
// ---- since Type1 has no real type we ignore fraction here !
// ---- At least we warn if there is a fraction.
val  = _rawIn.read() << 24;
val |= _rawIn.read() << 16;
val |= _rawIn.read() << 8;
val |= _rawIn.read();
                                
if ( ( val & 0xffff ) != 0 ) {
        if ( logger.isLoggable( Level.FINE ) ) {
                logger.fine( "Fraction will be rounded: "
                        + ( ( val & 0xffff ) / (float) 0x10000 ) );
        }
        if ( ( val & 0x8000 ) != 0 ) {
                // highest fraction bit set - add 1
                // (should also be ok for negative values)
                cmd.args.add( ( val >> 16 ) + 1 );
        } else {
                // fraction < 0.5 - ignore
                cmd.args.add( val >> 16 );
        }
} else {
        cmd.args.add( val >> 16 );
}

As you can see, I first read all four bytes, with the two integer bytes
in the high-order bits, so negative values are handled correctly (the
value has the same two's complement structure as a Java int). Then I
test whether any fraction bits are set. Since Type1 has no 'real' number
type, I can't keep the fraction, so I round it instead. Finally, the
integer part is obtained by shifting right by two bytes (the shift is
arithmetic, so the sign is preserved).
You are welcome to adapt this code to your implementation. I don't have
the time right now but could do it in the next few days.
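
For anyone who wants to try the arithmetic in isolation, here is a minimal
self-contained sketch of the same idea. The class and method names
(FixedOperand, readRaw, roundFixed, toDouble) are made up for illustration
and are not part of PDFBox or of my parser; the EOF check and the
conversion to double are additions that the snippet above does not have.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, for illustration only.
class FixedOperand {

    /** Reads the four bytes that follow b0 == 255 as one big-endian int. */
    static int readRaw(InputStream in) throws IOException {
        int val = 0;
        for (int i = 0; i < 4; i++) {
            int b = in.read();
            if (b < 0) {
                // read() returns -1 at end of stream
                throw new EOFException("truncated 5-byte operand");
            }
            val = (val << 8) | b;
        }
        return val;
    }

    /** Rounds a 16.16 fixed-point value to an int; a fraction >= 0.5 rounds up. */
    static int roundFixed(int val) {
        int whole = val >> 16; // arithmetic shift keeps the sign (floor)
        return (val & 0x8000) != 0 ? whole + 1 : whole;
    }

    /** Interprets a 16.16 fixed-point value as a double, keeping the fraction. */
    static double toDouble(int val) {
        return val / 65536.0;
    }

    public static void main(String[] args) throws IOException {
        // 0xFFFE8000 is -1.5 in 16.16 fixed point
        byte[] operand = { (byte) 0xFF, (byte) 0xFE, (byte) 0x80, 0x00 };
        int raw = readRaw(new ByteArrayInputStream(operand));
        System.out.println(toDouble(raw) + " -> " + roundFixed(raw)); // -1.5 -> -1
    }
}
```

Note that the shift rounds towards negative infinity, so adding 1 when the
top fraction bit is set gives "round half up": 1.5 becomes 2 and -1.5
becomes -1.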

-- Timo

 Timo Boehme
 OntoChem GmbH
 H.-Damerow-Str. 4
 06120 Halle/Saale
 T: +49 345 4780472
 F: +49 345 4780471
 [email protected]

_____________________________________________________________________

 OntoChem GmbH
 Geschäftsführer: Dr. Lutz Weber
 Sitz: Halle / Saale
 Registergericht: Stendal
 Registernummer: HRB 215461
_____________________________________________________________________
