[NTG-context] XMP metadata schema yields invalid PDF/A

2022-02-04 Thread Karl Pettersson via ntg-context
Hi

PDF/A files generated using ConTeXt fail validation with veraPDF, and
the reason seems to be that the dc:description metadata is defined with
the wrong type in the embedded XMP extension schema.

https://tex.stackexchange.com/questions/632380/generate-pdf-a-with-context

https://github.com/veraPDF/veraPDF-library/issues/1224

I can reproduce the problem using TeX Live 2021 (MkIV 2021.03.05). The
definition seems to be controlled by this code.

https://source.contextgarden.net/tex/context/base/mkiv/lpdf-pua.xml?search=rdf#l81

Regards

-- 
Karl Pettersson
Uppsala, Sweden

https://static-dust.klpn.se/
___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki : http://contextgarden.net
___


Re: [NTG-context] XMP metadata schema yields invalid PDF/A

2022-02-04 Thread Karl Pettersson via ntg-context
On Fri, Feb 04, 2022 at 10:25:27PM +0100, Hans Hagen via ntg-context wrote:
> so "dc:description" is not permitted? it is mentioned in
> 
> https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/elements/1.1/description
> 
> (btw it never failed before)

The description element is permitted, but the problem seems to be that
its valueType is defined as Text in the embedded schema, while the
metadata element in the PDF has an embedded
<rdf:Alt><rdf:li xml:lang="x-default"> structure.

https://github.com/veraPDF/veraPDF-library/issues/1224#issuecomment-1029932963

(I suppose the reference to "title" in the issue comment should be 
"description".)

Validating with veraPDF <1.20 does not raise the error. The validation
of redefined types seems to have changed in version 1.20.

https://github.com/veraPDF/veraPDF-library/blob/integration/RELEASENOTES.md#validation
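
To make the mismatch concrete, here is a minimal Python sketch of the kind of consistency check involved. It is not veraPDF's implementation. The SAMPLE packet is hand-written for illustration (it is not the actual ConTeXt output), and the helper names are hypothetical. It reports Dublin Core properties that the extension schema declares as Text but that are serialized as a language alternative (rdf:Alt).

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
PDFA_SCHEMA = "http://www.aiim.org/pdfa/ns/schema#"
PDFA_PROPERTY = "http://www.aiim.org/pdfa/ns/property#"

# Hand-written illustration, NOT the metadata ConTeXt actually emits.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:description>
      <rdf:Alt><rdf:li xml:lang="x-default">FOO</rdf:li></rdf:Alt>
    </dc:description>
    <dc:title>
      <rdf:Alt><rdf:li xml:lang="x-default">TITLE</rdf:li></rdf:Alt>
    </dc:title>
  </rdf:Description>
  <rdf:Description rdf:about=""
      xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/"
      xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#"
      xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#">
    <pdfaExtension:schemas>
      <rdf:Bag>
        <rdf:li rdf:parseType="Resource">
          <pdfaSchema:namespaceURI>http://purl.org/dc/elements/1.1/</pdfaSchema:namespaceURI>
          <pdfaSchema:property>
            <rdf:Seq>
              <rdf:li rdf:parseType="Resource">
                <pdfaProperty:name>description</pdfaProperty:name>
                <pdfaProperty:valueType>Text</pdfaProperty:valueType>
              </rdf:li>
            </rdf:Seq>
          </pdfaSchema:property>
        </rdf:li>
      </rdf:Bag>
    </pdfaExtension:schemas>
  </rdf:Description>
</rdf:RDF>"""


def declared_types(root, namespace_uri):
    """Map property name -> valueType declared for one namespace."""
    types = {}
    for schema in root.iter(f"{{{RDF}}}li"):
        uri = schema.find(f"{{{PDFA_SCHEMA}}}namespaceURI")
        if uri is None or (uri.text or "").strip() != namespace_uri:
            continue  # not a schema entry for this namespace
        for prop in schema.iter(f"{{{RDF}}}li"):
            name = prop.find(f"{{{PDFA_PROPERTY}}}name")
            vtype = prop.find(f"{{{PDFA_PROPERTY}}}valueType")
            if name is not None and vtype is not None:
                types[name.text.strip()] = vtype.text.strip()
    return types


def uses_lang_alt(root, namespace_uri, local_name):
    """True if the property is written with an rdf:Alt child."""
    return any(
        elem.find(f"{{{RDF}}}Alt") is not None
        for elem in root.iter(f"{{{namespace_uri}}}{local_name}")
    )


def mismatches(xmp_text):
    """Names declared as Text in the schema but serialized as rdf:Alt."""
    root = ET.fromstring(xmp_text)
    return [
        name
        for name, vtype in declared_types(root, DC).items()
        if vtype == "Text" and uses_lang_alt(root, DC, name)
    ]


print(mismatches(SAMPLE))
```

In this sample only description is flagged; title is also written as rdf:Alt, but no type is declared for it, so there is nothing to contradict.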

> 
> Hans

-- 
Karl Pettersson
Uppsala, Sweden

https://static-dust.klpn.se/


Re: [NTG-context] XMP metadata schema yields invalid PDF/A

2022-02-05 Thread Karl Pettersson via ntg-context
On Sat, Feb 05, 2022 at 12:37:44AM +0100, luigi scarso via ntg-context wrote:
> I am missing something here... true
> FOO
> makes a valid pdf 3a with verapdf 1.20.1.
> But dc:description is like dc:title, so where we are redefining
> dc:description as Text ?
> 
> -- 
> luigi

Attached is the metadata XML for the non-validating and the validating
example in the GitHub issue (extracted with
`pdfinfo -meta <file> | xmllint --format -`).

Here is a reference to the description element. Note that only
description seems to be redefined, not title.

https://source.contextgarden.net/tex/context/base/mkiv/lpdf-pua.xml#l81
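
For anyone who wants to repeat the extraction from a script, here is a rough Python equivalent of the pdfinfo/xmllint pipeline. It is only a sketch: it assumes pdfinfo (poppler-utils) is on the PATH, the function names are hypothetical, and the re-indenting uses Python's minidom rather than xmllint, so the whitespace will not be identical.

```python
import subprocess
import xml.dom.minidom


def extract_xmp(pdf_path):
    """Return the XMP metadata stream printed by `pdfinfo -meta`."""
    result = subprocess.run(
        ["pdfinfo", "-meta", pdf_path],
        check=True,           # raise if pdfinfo reports an error
        capture_output=True,
        text=True,
    )
    return result.stdout


def pretty_xml(xml_text):
    """Re-indent XML text and drop the blank lines minidom leaves behind."""
    dom = xml.dom.minidom.parseString(xml_text)
    lines = dom.toprettyxml(indent="  ").splitlines()
    return "\n".join(line for line in lines if line.strip())


# Usage (the file name is a placeholder):
#   print(pretty_xml(extract_xmp("out.pdf")))
```

Keeping the two steps separate means pretty_xml can also be used on metadata obtained in some other way.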



-- 
Karl Pettersson
Uppsala, Sweden

https://static-dust.klpn.se/



  
[Attachment: XMP metadata extracted from the test PDF. The XML tags were
lost in archiving and the text is cut off at the end. The surviving values
are listed below, grouped by the namespace of each rdf:Description.]

http://purl.org/dc/elements/1.1/
  application/pdf
  AUTHOR
  TITLE

http://ns.adobe.com/pdfx/1.3/
  out | 2022-02-02T21:34:48+01:00
  out
  2022-02-02 21:34
  www.pragma-ade.com
  contextgarden.net
  2019.03.21 21:39
  tug.org
  1.10
  7127
  5.3
  linux-64

http://ns.adobe.com/xap/1.0/
  2022-02-02T21:34:48+01:00
  LuaTeX 1.10 7127 + ConTeXt MkIV 2019.03.21 21:39
  2022-02-02T21:34:48+01:00
  2022-02-02T21:34:48+01:00

http://ns.adobe.com/pdf/1.3/
  LuaTeX-1.10
  False

http://ns.adobe.com/xap/1.0/mm/
  uuid:77baf08a-41c6-87cb-47d0-807f764f5064
  uuid:fe6c773e-42b0-864d-4c63-da5c57ec1a40

http://www.aiim.org/pdfa/ns/extension/
(pdfaProperty = http://www.aiim.org/pdfa/ns/property#,
 pdfaSchema = http://www.aiim.org/pdfa/ns/schema#)
  http://ns.adobe.com/pdf/1.3/
  pdf
  Adobe PDF Schema
  internal
  A name object indicating whether the document

Re: [NTG-context] XMP metadata schema yields invalid PDF/A

2022-02-05 Thread Karl Pettersson via ntg-context
On Sat, Feb 05, 2022 at 09:59:44AM +0100, luigi scarso wrote:
 
> it's not redefined -- it's dropped. And this is ok.
> You can also check with
>   FOO
>   
> 
>   TITLE
> 
>   
> and again it's valid.
> But
> https://www.iso.org/obp/ui/#iso:std:iso:19005:-1:ed-1:v2:cor:1:v1:en
> and
> https://www.iso.org/obp/ui/#iso:std:iso:19005:-1:ed-1:v2:cor:2:v1:en:tab:1
> suggest that dc:description and dc:title are of the same type,
> which is coherent with
> XMPSpecificationPart1.pdf
> as in
> https://github.com/adobe/XMP-Toolkit-SDK/tree/main/docs
> 
> I.e. these are correct   --  but Subject must agree with dc:description
>   
> 
>   
> 
>   
>   
> 
>   TITLE
> 
>   
> 
> At least, this is what I understand.
>

From what I understand, ConTeXt applies a custom Dublin Core schema
containing just a redefinition of dc:description, which is not coherent
with how dc:description is written by the application.

This patch to lpdf-pda.xml removes the validation error.
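
The effect of removing the block can be sketched in Python as follows. This is an illustration only: the fragment is hand-written in the usual PDF/A extension schema vocabulary and is not copied from lpdf-pda.xml, and drop_schema is a hypothetical helper. It deletes every schema entry in the rdf:Bag that declares the Dublin Core namespace and leaves the other entries alone.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFA_SCHEMA = "http://www.aiim.org/pdfa/ns/schema#"
DC = "http://purl.org/dc/elements/1.1/"

# Hand-written fragment for illustration, NOT the content of lpdf-pda.xml.
FRAGMENT = """<rdf:Bag
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#">
  <rdf:li rdf:parseType="Resource">
    <pdfaSchema:namespaceURI>http://ns.adobe.com/pdf/1.3/</pdfaSchema:namespaceURI>
    <pdfaSchema:schema>Adobe PDF Schema</pdfaSchema:schema>
  </rdf:li>
  <rdf:li rdf:parseType="Resource">
    <pdfaSchema:namespaceURI>http://purl.org/dc/elements/1.1/</pdfaSchema:namespaceURI>
    <pdfaSchema:schema>Dublin Core Schema</pdfaSchema:schema>
  </rdf:li>
</rdf:Bag>"""


def drop_schema(bag, namespace_uri):
    """Remove schema entries for namespace_uri; return how many were removed."""
    removed = 0
    # Iterate over a copy because the bag is modified inside the loop.
    for entry in list(bag.findall(f"{{{RDF}}}li")):
        uri = entry.find(f"{{{PDFA_SCHEMA}}}namespaceURI")
        if uri is not None and (uri.text or "").strip() == namespace_uri:
            bag.remove(entry)
            removed += 1
    return removed


bag = ET.fromstring(FRAGMENT)
removed = drop_schema(bag, DC)
remaining = [
    entry.find(f"{{{PDFA_SCHEMA}}}namespaceURI").text
    for entry in bag.findall(f"{{{RDF}}}li")
]
print(removed, remaining)
```

With the Dublin Core entry gone, the metadata no longer carries a declaration that contradicts how dc:description is written.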

> -- 
> luigi

-- 
Karl Pettersson
Uppsala, Sweden

https://static-dust.klpn.se/
79a80,97
[XML tags lost in archiving; surviving text of the diff lines:]
> http://purl.org/dc/elements/1.1/
> pdf
> Dubline Core Schema
> internal
> Subject in Document Properties
> description
> Text


Re: [NTG-context] XMP metadata schema yields invalid PDF/A

2022-02-05 Thread Karl Pettersson via ntg-context

> 
> From what I understand, ConTeXt applies a custom Dublin Core schema
> containing just a redefinition of dc:description, which is not coherent
> with how dc:description is written by the application.
> 
> This patch to lpdf-pda.xml removes the validation error.
> 

Sorry, the diff was the wrong way round. Here it is in the
`diff original new` direction.

-- 
Karl Pettersson
Uppsala, Sweden

https://static-dust.klpn.se/
80,97d79
[XML tags lost in archiving; surviving text of the diff lines:]
< http://purl.org/dc/elements/1.1/
< pdf
< Dubline Core Schema
< internal
< Subject in Document Properties
< description
< Text


Re: [NTG-context] XMP metadata schema yields invalid PDF/A

2022-02-05 Thread Karl Pettersson via ntg-context
On Sat, Feb 05, 2022 at 03:18:08PM +0100, luigi scarso wrote:
>
> 
> Following
> https://www.pdfa.org/wp-content/until2016_uploads/2011/08/pdfa_metadata-2b.pdf
> I think we should replace
> Text
> with
>  Lang Alt
> 
> -- 
> luigi

Yes, if a custom schema is needed at all (from what I can see, the XML
does not embed any type definition for dc:title, for example).
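
If the custom schema is kept, the alternative is to change the declared type. A minimal Python sketch of that change, assuming a schema entry in the usual PDF/A extension vocabulary (the ENTRY fragment is hand-written and retype_property is a hypothetical helper, not part of ConTeXt):

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFA_PROPERTY = "http://www.aiim.org/pdfa/ns/property#"

# Hand-written schema entry for illustration, NOT copied from ConTeXt.
ENTRY = """<rdf:li rdf:parseType="Resource"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#"
    xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#">
  <pdfaSchema:namespaceURI>http://purl.org/dc/elements/1.1/</pdfaSchema:namespaceURI>
  <pdfaSchema:property>
    <rdf:Seq>
      <rdf:li rdf:parseType="Resource">
        <pdfaProperty:name>description</pdfaProperty:name>
        <pdfaProperty:valueType>Text</pdfaProperty:valueType>
      </rdf:li>
    </rdf:Seq>
  </pdfaSchema:property>
</rdf:li>"""


def retype_property(schema_entry, prop_name, new_type):
    """Set the valueType of the named property; return the number changed."""
    changed = 0
    for prop in schema_entry.iter(f"{{{RDF}}}li"):
        name = prop.find(f"{{{PDFA_PROPERTY}}}name")
        vtype = prop.find(f"{{{PDFA_PROPERTY}}}valueType")
        if name is None or vtype is None:
            continue  # the outer schema entry itself has no property fields
        if (name.text or "").strip() == prop_name:
            vtype.text = new_type
            changed += 1
    return changed


entry = ET.fromstring(ENTRY)
changed = retype_property(entry, "description", "Lang Alt")
new_type = entry.find(f".//{{{PDFA_PROPERTY}}}valueType").text
print(changed, new_type)
```

Either route should remove the inconsistency: dropping the entry means no type is declared, while retyping makes the declaration agree with the rdf:Alt form that is written.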

-- 
Karl Pettersson
Uppsala, Sweden

https://static-dust.klpn.se/