Fantastic, I see the updated schema and the meta.xml validation error has gone away.
thanks!

Dan Stoner
iDigBio / ACIS Laboratory
University of Florida

________________________________________
From: IPT <ipt-boun...@lists.gbif.org> on behalf of Matthew Blissett <mbliss...@gbif.org>
Sent: Thursday, December 12, 2019 11:08 AM
To: ipt@lists.gbif.org
Subject: Re: [IPT] coreid (lowercase "i") vs coreId in meta.xml - schema validation

Hi Dan,

On 12/12/2019 16:30, Stoner, Dan F wrote:
> I found some oddities and I am not exactly sure where to go next.
>
> We are noticing the following while processing meta.xml in darwin core
> archives produced by IPT (and other servers):
>
> Schema validation failed, continuing unvalidated
> XMLSyntaxError: Element '{http://rs.tdwg.org/dwc/text/}coreid': This
> element is not expected. Expected is ( {http://rs.tdwg.org/dwc/text/}coreId )

There's some background to this on this issue:
https://github.com/tdwg/dwc/issues/143

The schema itself and the documentation were conflicting, and this was fixed (in mine and Tim's opinion) the wrong way, by changing the schema.

*I've just pushed a commit to fix it the right way,* i.e. reflecting 99% actual usage and leaving the schema as it was for almost a decade.

Although we do accept either, we still see only 31 datasets registered in GBIF with "coreId" rather than "coreid".

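
[Editor's sketch, not part of the original message.] For readers who want to check which spelling their own archives use, the following is a minimal illustration of reading a meta.xml while accepting either "coreid" or "coreId", in the spirit of "we do accept either" above. It is not GBIF's or IPT's actual code. The function name `find_core_id_elements` and the `SAMPLE` document are made up for illustration; only the namespace `http://rs.tdwg.org/dwc/text/` comes from the error message quoted in this thread.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the validation error quoted in the thread.
DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"


def find_core_id_elements(meta_xml: str):
    """Return (spelling, index) for every coreid/coreId element found.

    Hypothetical helper: matches the element name case-insensitively so
    that both spellings are accepted, and reports which one was used.
    """
    root = ET.fromstring(meta_xml)
    prefix = "{" + DWC_TEXT_NS + "}"
    found = []
    for elem in root.iter():
        # ElementTree tags look like "{namespace}localname".
        if not elem.tag.startswith(prefix):
            continue
        local_name = elem.tag[len(prefix):]
        if local_name.lower() == "coreid":
            found.append((local_name, elem.get("index")))
    return found


# Made-up sample archive descriptor, using the lowercase spelling.
SAMPLE = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
  </core>
  <extension rowType="http://rs.gbif.org/terms/1.0/Multimedia">
    <files><location>multimedia.txt</location></files>
    <coreid index="0"/>
  </extension>
</archive>"""

if __name__ == "__main__":
    print(find_core_id_elements(SAMPLE))
```

Reporting the spelling rather than silently normalising it makes it easy to count how many archives in a collection use each form before deciding how strict a validator should be.
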
> It seems like most consumers are not actually validating meta.xml using the
> schema, and the producers are generating files out of compliance with the
> schema.
>
> Most of the Darwin Core archives I have manually inspected and tried to
> validate contain meta.xml with lowercase "i" in coreid despite the Standard
> indicating capital "I" in coreId.
>
> I poked at the GBIF Darwin Core Validator 3 code repo and found this:
>
> schema.meta=https://raw.githubusercontent.com/tdwg/dwc/master/standard/documents/text/tdwg_dwc_text.xsd,http://rs.tdwg.org/dwc/text/tdwg_dwc_text.xsd
>
> The first link leads to 404, the second leads to an xsd that contains the
> proper coreId. So maybe the Validator is not being "strict" about validation
> against the schema?

I suspect it has been running for so long that, when the validator process was originally started, both URLs were valid, and had coreid or one of each.

Cheers,

Matt
_______________________________________________
IPT mailing list
IPT@lists.gbif.org
https://lists.gbif.org/mailman/listinfo/ipt