I would think supporting the following PDF 2.0 features is highly relevant, given that other implementations are already generating PDF 2.0 files today (see https://pdfa.org/supporting-pdf20/):
* UTF-8 support for user-visible strings, such as bookmarks (outlines), OCG layer names, certain annot fields, etc. Missing support results in the ugly display or extraction of "mojibake" for the UTF-8 BOMs. See https://pdfa.org/understanding-utf-8-in-pdf-2-0. Note that this does NOT impact content streams or text extraction (unless you also combine with Logical Structure)!
* the latest encryption (AES-GCM 256 bit), dig-sig, hash algorithms, and Unicode password support, all of which are FAR more up-to-date and secure than all legacy crypto. 3rd parties are already generating such files, which will otherwise be unreadable/unprocessable by PDFBox. I don't know what crypto library PDFBox uses, but all the new algorithms are standard modern crypto algorithms. Unicode passwords also follow the latest ICU (not knowing what you use internally), but be careful so that legacy files continue to work.
* if high-quality semantic content extraction is important, then updating for PDF 2.0 standard structure types, although this is likely to be a larger dev effort.

If you want to address rendering issues, then:
* PLEASE PLEASE PLEASE fix your incorrect rendering of "fill and stroke" in the presence of transparency! See https://github.com/pdf-association/pdf-differences/blob/main/Atomic-Fill%2BStroke/README.md
* making sure you are using the correct blend mode formulae for ColorBurn and ColorDodge (from 2009 - I have not checked your implementation): https://github.com/pdf-association/pdf-differences/tree/main/ColorBurn-ColorDodge
* ensuring negative dash phase is handled correctly (what to do was previously unstated) - see https://github.com/pdf-association/pdf-differences/tree/main/Negative-DashPhase
* page-based OutputIntents. Already in use, especially in print-centric workflows where page merging and imposition across PDFs is now far easier to do. Updating the implementation to select a page-based OutputIntent ahead of the document-level OutputIntent should be relatively easy.
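To illustrate the page-based OutputIntent point, here is a rough sketch of the lookup order in plain Java. The OutputIntent class and the select method are made-up names for illustration, not PDFBox API, and matching intents by their /S subtype is my assumption about how an implementation would pick among several entries:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Hypothetical, library-independent model of OutputIntent lookup; not PDFBox API. */
public class OutputIntentSelection {

    /** Minimal stand-in for an OutputIntent dictionary (/S and /OutputConditionIdentifier). */
    public static final class OutputIntent {
        private final String subtype;
        private final String conditionId;

        public OutputIntent(String subtype, String conditionId) {
            this.subtype = subtype;
            this.conditionId = conditionId;
        }

        public String subtype() { return subtype; }
        public String conditionId() { return conditionId; }
    }

    /**
     * Returns the intent that applies to one page: the first page-level entry
     * with the wanted subtype, otherwise the first matching document-level entry.
     */
    public static Optional<OutputIntent> select(List<OutputIntent> pageIntents,
                                                List<OutputIntent> documentIntents,
                                                String subtype) {
        Optional<OutputIntent> fromPage = firstMatch(pageIntents, subtype);
        return fromPage.isPresent() ? fromPage : firstMatch(documentIntents, subtype);
    }

    /** Treats a missing (null) array the same as an empty one. */
    private static Optional<OutputIntent> firstMatch(List<OutputIntent> intents, String subtype) {
        if (intents == null) {
            return Optional.empty();
        }
        return intents.stream().filter(i -> subtype.equals(i.subtype())).findFirst();
    }

    public static void main(String[] args) {
        List<OutputIntent> doc = Arrays.asList(new OutputIntent("GTS_PDFX", "CGATS TR 001"));
        List<OutputIntent> page = Arrays.asList(new OutputIntent("GTS_PDFX", "FOGRA39"));

        // Page-level entry wins when present.
        System.out.println(select(page, doc, "GTS_PDFX").get().conditionId());
        // Falls back to the document-level entry for pages without their own.
        System.out.println(select(Collections.<OutputIntent>emptyList(), doc, "GTS_PDFX").get().conditionId());
    }
}
```

A renderer would presumably call something like this once per page, so that imposed or merged pages each keep their own output condition.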
* the use of transparency and blend mode of an annotation's appearance stream when rendered onto a page (not _within_ the annot).

Obviously there are other PDF 2.0 features but these would be my go-to short list for starting to address the most obvious visible differences. See also https://pdfa.org/how-to-get-started-with-pdf-2-0/ since reporting a simple PDF version is unlikely to withstand the test of time...

Of course I am also biased 😊 - and I'm not a Java expert!

> -----Original Message-----
> From: Tilman Hausherr <thaush...@t-online.de>
> Sent: Thursday, November 9, 2023 3:35 AM
> To: users@pdfbox.apache.org
> Subject: Re: PDF 2.0, PDF/A-4 support
>
> We don't have roadmaps. If you need a PDF 2.0 feature, tell us which one
> and why. PDF/A-4 isn't a topic because preflight isn't developed
> further. Use VeraPDF instead. You can create PDF/A-4 files like you can
> create PDF/A-1b files.
>
> Tilman
>
> On 08.11.2023 00:15, Gili Tzabari wrote:
> > Hi,
> >
> > I noticed that PDFBox 3.0 was recently released, but I can't tell what
> > the status/roadmap is for PDF 2.0 and PDF/A-4 support.
> >
> > Can someone in the know please let me know where we stand?
> >
> > Thanks,
> > Gili
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
> > For additional commands, e-mail: users-h...@pdfbox.apache.org