[ 
https://issues.apache.org/jira/browse/PDFBOX-3152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053431#comment-15053431
 ] 

John Hewson commented on PDFBOX-3152:
-------------------------------------

I've improved the inversion of Encoding vectors so that it happens earlier. 
This means we no longer need to clobber the collisions, because it's ok to 
use them for the inverted map.
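
As a rough illustration of what inverting an encoding vector involves (this is 
not the actual PDFBox code), the sketch below turns a code -> glyph name map 
into a glyph name -> code map. The class and method names are hypothetical, and 
the "first code wins" rule for names that occur at more than one code is an 
assumption made for the example, not a statement about what the fix does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingInversionSketch {
    // Inverts code -> glyph name into glyph name -> code.
    // Assumption for this sketch: when two codes share a glyph name,
    // the first code encountered is kept in the inverted map.
    public static Map<String, Integer> invert(Map<Integer, String> codeToName) {
        Map<String, Integer> nameToCode = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> entry : codeToName.entrySet()) {
            nameToCode.putIfAbsent(entry.getValue(), entry.getKey());
        }
        return nameToCode;
    }

    public static void main(String[] args) {
        // A tiny, illustrative encoding vector with one collision ("space").
        Map<Integer, String> codeToName = new LinkedHashMap<>();
        codeToName.put(32, "space");
        codeToName.put(160, "space");
        codeToName.put(183, "periodcentered");

        // prints {space=32, periodcentered=183}
        System.out.println(invert(codeToName));
    }
}
```

The forward map keeps both entries for "space", so decoding still works for 
either code, while the inverted map holds a single code per glyph name.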

The root cause of the NPE is that we weren't checking whether the Encoding in 
use actually contains the character before we tried to use it. I've added a 
check for this, along with some extra code to produce helpful error messages. 
The user is using a Type 1 font with WinAnsiEncoding and is trying to access a 
character outside of that encoding, which won't work. We now produce a clear 
error message: "IllegalArgumentException: U+00B7 is not available in this 
font's encoding: WinAnsiEncoding".
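
The kind of check described above might look roughly like the following. This 
is a self-contained sketch, not the PDFBox implementation: the class, the two 
maps and the encode() signature are hypothetical stand-ins, and the maps hold 
only the few entries needed to mirror the report (the name lookup yields 
"middot", while the inverted map only knows "periodcentered").

```java
import java.util.HashMap;
import java.util.Map;

public class EncodeCheckSketch {
    // Hypothetical stand-ins for the glyph list and the inverted encoding.
    static final Map<Integer, String> UNICODE_TO_NAME = new HashMap<>();
    static final Map<String, Integer> NAME_TO_CODE = new HashMap<>();
    static final String ENCODING_NAME = "WinAnsiEncoding";

    static {
        UNICODE_TO_NAME.put(0x0041, "A");
        UNICODE_TO_NAME.put(0x00B7, "middot");   // name returned in the report
        NAME_TO_CODE.put("A", 65);
        NAME_TO_CODE.put("periodcentered", 183); // name present in the inverted map
    }

    // Returns the character code for a code point, or throws if the
    // encoding has no entry for it. Without the null check, unboxing the
    // missing Integer would raise a NullPointerException instead.
    public static int encode(int codePoint) {
        String name = UNICODE_TO_NAME.get(codePoint);
        Integer code = (name == null) ? null : NAME_TO_CODE.get(name);
        if (code == null) {
            throw new IllegalArgumentException(String.format(
                    "U+%04X is not available in this font's encoding: %s",
                    codePoint, ENCODING_NAME));
        }
        return code;
    }

    public static void main(String[] args) {
        System.out.println(encode('A')); // prints 65
        try {
            encode(0x00B7);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing with an IllegalArgumentException that names the code point and the 
encoding tells the caller which character to replace or which font to change, 
which a bare NullPointerException does not.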

> NullPointerException in PDType1Font.encode() with centered dot
> --------------------------------------------------------------
>
>                 Key: PDFBOX-3152
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3152
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>         Environment: 2.0.0 RC2
>            Reporter: Tilman Hausherr
>            Assignee: John Hewson
>             Fix For: 2.0.0
>
>
> As reported by Pedro L.M. in the user mailing list
> Font used: PDType1Font.HELVETICA
> Code:
> {code}content.showText("sol·licitud");{code}
> {code}
> Exception in thread "main" java.lang.NullPointerException
>       at org.apache.pdfbox.pdmodel.font.PDType1Font.encode(PDType1Font.java:357)
>       at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:285)
>       at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:482)
> {code}
> That character (B7) is listed as "middot" and "periodcentered" in 
> {{glyphlist.txt}}, and "middot" is returned. But "middot" doesn't exist in 
> the inverted list:
> {code}
> {Udieresis=220, tilde=152, greater=62, Ograve=210, bracketright=93, eth=240, 
> otilde=245, asciitilde=126, odieresis=246, Agrave=192, ograve=242, 
> parenright=41, brokenbar=166, bracketleft=91, Idieresis=207, copyright=169, 
> aring=229, Egrave=200, Aring=197, trademark=153, idieresis=239, D=68, E=69, 
> F=70, G=71, A=65, B=66, C=67, L=76, M=77, N=78, O=79, florin=131, H=72, 
> Acircumflex=194, I=73, J=74, six=54, K=75, currency=164, U=85, T=84, W=87, 
> adieresis=228, V=86, Q=81, P=80, S=83, R=82, quotesingle=39, Y=89, Eth=208, 
> X=88, braceright=125, Z=90, f=102, g=103, d=100, Ccedilla=199, e=101, b=98, 
> c=99, a=97, exclamdown=161, thorn=254, n=110, perthousand=137, o=111, 
> divide=247, acircumflex=226, l=108, m=109, j=106, Odieresis=214, k=107, 
> h=104, i=105, w=119, v=118, Ugrave=217, u=117, igrave=236, t=116, s=115, 
> r=114, yen=165, q=113, p=112, Yacute=221, quotedblleft=147, z=122, y=121, 
> bullet=127, threequarters=190, edieresis=235, x=120, atilde=227, percent=37, 
> Atilde=195, exclam=33, braceleft=123, numbersign=35, ntilde=241, endash=150, 
> scaron=154, comma=44, threesuperior=179, grave=96, period=46, AE=198, 
> ordfeminine=170, Otilde=213, hyphen=45, logicalnot=172, ocircumflex=244, 
> colon=58, acute=180, backslash=92, ordmasculine=186, four=52, question=63, 
> Scaron=138, cent=162, Oacute=211, agrave=224, five=53, at=64, onehalf=189, 
> equal=61, multiply=215, Ucircumflex=219, oslash=248, asciicircum=94, 
> Icircumflex=206, seven=55, semicolon=59, registered=174, cedilla=184, ae=230, 
> macron=175, paragraph=182, guilsinglright=155, nine=57, dagger=134, 
> ellipsis=133, zero=48, oacute=243, circumflex=136, oe=156, uacute=250, 
> guillemotright=187, ccedilla=231, sterling=163, plus=43, onequarter=188, 
> germandbls=223, Adieresis=196, questiondown=191, Thorn=222, Oslash=216, 
> bar=124, Uacute=218, onesuperior=185, guilsinglleft=139, space=32, 
> Zcaron=142, Euro=128, yacute=253, quotedbl=34, quotesinglbase=130, 
> eacute=233, OE=140, ugrave=249, dollar=36, dieresis=168, ydieresis=255, 
> Edieresis=203, slash=47, egrave=232, mu=181, twosuperior=178, 
> ucircumflex=251, ampersand=38, degree=176, Eacute=201, udieresis=252, 
> Igrave=204, zcaron=158, eight=56, quotedblright=148, daggerdbl=135, 
> icircumflex=238, quoteright=146, three=51, quotedblbase=132, underscore=95, 
> Ecircumflex=202, ecircumflex=234, periodcentered=183, iacute=237, 
> guillemotleft=171, emdash=151, one=49, Iacute=205, Aacute=193, less=60, 
> Ocircumflex=212, asterisk=42, parenleft=40, Ntilde=209, section=167, two=50, 
> Ydieresis=159, plusminus=177, quoteleft=145, aacute=225}
> {code}
> "middot" isn't mentioned anywhere in the PDF specification, but 
> "periodcentered" is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

