On 28.03.2021 at 18:44, sahy...@fileaffairs.de wrote:
On Sunday, 28.03.2021 at 16:36 +0200, Tilman Hausherr wrote:
I don't have an opinion on XMP because I don't use it.
As XMP is needed for getting/setting metadata, especially since PDF 2.0,
there needs to be support for it, though not necessarily from us directly,
i.e. we could integrate a different lib.

I'll revert the work done in PDFBOX-5128 and we get back to it after
3.0 - WDYT?


No, why revert? As far as I understand it, it makes it possible to still parse XMPs with non-standard schemas so that people can retrieve the standard stuff, which is very useful.

Tilman




BR
Maruan

Re preflight, I agree with you. It was great but it has hit a dead end,
and VeraPDF is better because it is more flexible.

Tilman

On 28.03.2021 at 15:52, Andreas Lehmkuehler wrote:
On 28.03.21 at 15:00, sahy...@fileaffairs.de wrote:
Fellow colleagues,

there was some discussion about the ability of XMPBox to parse
arbitrary XMP, which led to PDFBOX-5128.

Now, after digging into the code and reading through the various specs
for XMP and PDF/A: as it stands, XMPBox in its current implementation is
too restricted from the start. By default (although there is a way around
it) it only supports parsing predefined XMP schemas, restricted to the
ones defined in PDF/A-1, and it also does some validation in the parsing
phase.
Exactly the point where I stopped some time ago, when trying to just
expand the parser ;-)


Now, in order to get to an implementation for arbitrary XMP, that needs
to change, with the validation for PDF/A-1 put on top. We could use the
existing implementation in a generalized way, use an existing Java XMP
parser such as Adobe's XMPCore, or approach it in a layered fashion
(XML -> RDF -> XMP) with supporting libs for that.
The other option would be to keep XMPBox as is and, for general-purpose
use, add a general parser to the project or simply refer to XMPCore.
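
To make the layered idea a bit more concrete, here is a minimal sketch of only the lowest layers, assuming nothing beyond the JDK's built-in XML parser. It reads an XMP packet as namespace-aware XML and groups the child elements of each rdf:Description by namespace URI, without any schema whitelist or PDF/A validation. The class name XmpLayerSketch, the method propertiesByNamespace and the custom namespace are made up for illustration; this is not XMPBox or XMPCore API, and it ignores properties written as attributes as well as real RDF semantics.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only.
class XmpLayerSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Layer 1 (XML): parse the packet with namespaces enabled.
    // Layer 2 (rough RDF): collect the element children of every
    // rdf:Description, keyed by namespace URI. No schema is rejected here;
    // any validation (e.g. for PDF/A-1) would sit on top of this result.
    static Map<String, List<String>> propertiesByNamespace(byte[] xmp) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xmp));

        Map<String, List<String>> result = new TreeMap<>();
        NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
        for (int i = 0; i < descriptions.getLength(); i++) {
            NodeList children = descriptions.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Skip whitespace text nodes and elements without a namespace.
                if (child.getNodeType() == Node.ELEMENT_NODE && child.getNamespaceURI() != null) {
                    result.computeIfAbsent(child.getNamespaceURI(), k -> new ArrayList<>())
                          .add(child.getLocalName());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Sample packet with one standard (Dublin Core) and one made-up schema.
        String xmp = "<x:xmpmeta xmlns:x='adobe:ns:meta/'>"
                + "<rdf:RDF xmlns:rdf='" + RDF_NS + "'>"
                + "<rdf:Description rdf:about=''"
                + " xmlns:dc='http://purl.org/dc/elements/1.1/'"
                + " xmlns:my='http://example.com/ns/custom/'>"
                + "<dc:format>application/pdf</dc:format>"
                + "<my:ticket>42</my:ticket>"
                + "</rdf:Description></rdf:RDF></x:xmpmeta>";
        System.out.println(propertiesByNamespace(xmp.getBytes(StandardCharsets.UTF_8)));
    }
}
```

With something like this as a base, the standard properties stay retrievable even when unknown schemas are present, and a PDF/A-1 check could be a separate pass over the collected namespaces.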

That leads me to the question about the benefit of having a general
purpose (ASL licensed) XMP lib as part of PDFBox? Thoughts?
It replaced JempBox when preflight was added to PDFBox; that said, the
reason was more or less historical.

I myself never needed that XMP stuff. It is used by TIKA and preflight
and maybe others.

I have to admit that I already thought about the future of preflight.
I had planned to bring up that topic after releasing 3.0.0, but why wait.

Preflight is part of PDFBox but is practically not maintained.
Preflight support is limited to PDF/A-1b and I don't see anybody who plans
to extend it. VeraPDF has a lot more to offer and is open source as
well, so it may be a better alternative ...

How about removing preflight with 4.0.0? This would remove the one and
only hard dependency on XMPBox, so that it would be easier to decide
whether we really need to maintain our own XMP lib.


Andreas

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org