[ 
https://issues.apache.org/jira/browse/PDFBOX-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436098#comment-17436098
 ] 

Chris Newhouse commented on PDFBOX-5297:
----------------------------------------

OK, great, thanks! Working from this code/fix, we've also come across another 
bad file where `daBase` was an instance of `COSString`, but its value was 
`/Helvetica blah blah blah`. The above code didn't fix it, so I modified it to 
something like the following, which fixed things. Does this look reasonable?
{code:java}
for (PDField field : acroForm.getFieldTree()) {
  if (field instanceof PDVariableText) {
    String da;
    
    // If it was a COSName, it really ought to result in a setDefaultAppearance
    // call, per the comments in the PDFBOX issue above. Otherwise, we can
    // probably avoid the call to setDefaultAppearance if it was already a
    // COSString AND we didn't knowingly fix anything.
    boolean wasCosName = false;

    COSBase daBase = field.getCOSObject().getDictionaryObject(COSName.DA);

    if (daBase instanceof COSName) {
      da = ((COSName) daBase).getName();
      wasCosName = true;
    } else if (daBase instanceof COSString) {
      da = ((COSString) daBase).getString();
    } else {
      // Can't do anything about this one...
      continue;
    }

    boolean wasFixedThisRound = false;

    if (da.startsWith("Helvetica")) {
      da = da.replace("Helvetica", "/Helv");
      wasFixedThisRound = true;
    }
    else if (da.startsWith("/Helvetica")) {
      da = da.replace("/Helvetica", "/Helv");
      wasFixedThisRound = true;
    }
    else if (da.startsWith("ZapfDingbatsITC")) {
      da = da.replace("ZapfDingbatsITC", "/ZaDb");
      wasFixedThisRound = true;
    }
    else if (da.startsWith("/ZapfDingbatsITC")) {
      da = da.replace("/ZapfDingbatsITC", "/ZaDb");
      wasFixedThisRound = true;
    }    
    else if (!da.startsWith("/")) {
      // Log here anything that doesn't start with `/`
    }

    if (wasFixedThisRound || wasCosName) {
      ((PDVariableText) field).setDefaultAppearance(da);
    }
  }
}
{code}
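
The string rewriting in that loop can be pulled out and checked without PDFBox on the classpath. Below is a minimal sketch of that idea, not a definitive fix: `DaNormalizer`, `normalize`, and `FONT_ALIASES` are hypothetical names, and the two aliases are just the ones from the snippet above. It differs from the loop in two ways, both of which are assumptions on my part: it only rewrites the leading font name (the loop's `String.replace` touches every occurrence), and it requires the name to be followed by whitespace or end of string, so something like `/HelveticaNeue` is left alone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: normalizes the leading font name
// of a default appearance (DA) string.
final class DaNormalizer {
    // Assumed alias table, taken from the two cases handled in the loop above.
    private static final Map<String, String> FONT_ALIASES = new LinkedHashMap<>();
    static {
        FONT_ALIASES.put("Helvetica", "/Helv");
        FONT_ALIASES.put("ZapfDingbatsITC", "/ZaDb");
    }

    private DaNormalizer() {
    }

    // Returns the DA string with a known bad leading font name replaced,
    // or the input unchanged if nothing matched.
    static String normalize(String da) {
        if (da == null) {
            return null;
        }
        // Treat "Helvetica ..." and "/Helvetica ..." the same way.
        String body = da.startsWith("/") ? da.substring(1) : da;
        for (Map.Entry<String, String> alias : FONT_ALIASES.entrySet()) {
            String bad = alias.getKey();
            if (!body.startsWith(bad)) {
                continue;
            }
            // Assumption: only rewrite a complete name, i.e. one followed by
            // whitespace or the end of the string.
            boolean complete = body.length() == bad.length()
                    || Character.isWhitespace(body.charAt(bad.length()));
            if (complete) {
                return alias.getValue() + body.substring(bad.length());
            }
        }
        return da;
    }
}
```

In the loop, `wasFixedThisRound` would then become `!fixed.equals(da)`, where `fixed` is the result of `DaNormalizer.normalize(da)`, and `fixed` is what gets passed to `setDefaultAppearance`.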

> class org.apache.pdfbox.cos.COSName cannot be cast to class 
> org.apache.pdfbox.cos.COSString
> -------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-5297
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5297
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm
>    Affects Versions: 2.0.24
>            Reporter: Chris Newhouse
>            Priority: Major
>
> A customer provided us with a PDF that contains an AcroForm and has some of 
> the data filled in. There are various ways to trigger the error, but here's a 
> stacktrace:
> {code:java}
> class org.apache.pdfbox.cos.COSName cannot be cast to class org.apache.pdfbox.cos.COSString (org.apache.pdfbox.cos.COSName and org.apache.pdfbox.cos.COSString are in unnamed module of loader 'app')
>  at org.apache.pdfbox.pdmodel.interactive.form.PDVariableText.getDefaultAppearanceString(PDVariableText.java:91)
>  at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.<init>(AppearanceGeneratorHelper.java:114)
>  at org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>  at org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm.refreshAppearances(PDAcroForm.java:331)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:566){code}
> The PDF contains sensitive user information, so I cannot post it here 
> publicly, but I'd be willing to submit it to a private upload area. When I 
> use an editor to remove/change the sensitive data, the problem goes away or 
> sprouts up as a different error (related to fonts).
>  
> Here is a little bit of metadata I can provide right now:
> {code:java}
> {
>  "Author": "SE:W:CAR:MP",
>  "CreationDate": "D:20211012165530Z00'00'",
>  "Creator": "Adobe LiveCycle Designer ES 9.0",
>  "Keywords": "Fillable",
>  "ModDate": "D:20211012165530Z00'00'",
>  "Producer": "macOS Version 10.15.7 (Build 19H1417) Quartz PDFContext",
>  "Subject": "Request for Taxpayer Identification Number and Certification",
>  "Title": "Form W-9 (Rev. October 2018)"
> }{code}


