Hi Vignesh
Thanks for the thorough review!
On 04.10.23 11:39, vignesh C wrote:
Few comments:
1) Why the default option was chosen without comments shouldn't it be
the other way round?
+opt_xml_serialize_format:
+ INDENT
{ $$ = XMLSERIALIZE_INDENT; }
+ | NO INDENT
{ $$ = XMLSERIALIZE_NO_FORMAT; }
+ | CANONICAL
{ $$ = XMLSERIALIZE_CANONICAL; }
+ | CANONICAL WITH NO COMMENTS
{ $$ = XMLSERIALIZE_CANONICAL; }
+ | CANONICAL WITH COMMENTS
{ $$ = XMLSERIALIZE_CANONICAL_WITH_COMMENTS; }
+ | /*EMPTY*/
{ $$ = XMLSERIALIZE_NO_FORMAT; }
I'm not sure that is the way to go. The main purpose of the canonical
form is to check whether two documents have the same content, and comments
can differ even when the contents of two documents are otherwise identical.
What are your concerns regarding this default behaviour?
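To make the defaults under discussion concrete, here is a minimal standalone C sketch of the mapping that opt_xml_serialize_format encodes. The helper name format_for_option and its string argument are made up for illustration only; the patch does this in gram.y, not with string comparisons.

```c
#include <assert.h>
#include <string.h>

/* Mirrors the XmlSerializeFormat enum the patch adds to primnodes.h. */
typedef enum XmlSerializeFormat
{
    XMLSERIALIZE_INDENT,
    XMLSERIALIZE_CANONICAL,
    XMLSERIALIZE_CANONICAL_WITH_COMMENTS,
    XMLSERIALIZE_NO_FORMAT
} XmlSerializeFormat;

/*
 * Hypothetical helper, not part of the patch: map the option text that
 * follows "AS type" to a format, or return -1 for a combination the
 * grammar rejects (e.g. "CANONICAL INDENT").
 */
int
format_for_option(const char *opt)
{
    assert(opt != NULL);

    if (strcmp(opt, "INDENT") == 0)
        return XMLSERIALIZE_INDENT;
    if (strcmp(opt, "CANONICAL WITH COMMENTS") == 0)
        return XMLSERIALIZE_CANONICAL_WITH_COMMENTS;
    /* Comments are dropped unless WITH COMMENTS is spelled out. */
    if (strcmp(opt, "CANONICAL") == 0 ||
        strcmp(opt, "CANONICAL WITH NO COMMENTS") == 0)
        return XMLSERIALIZE_CANONICAL;
    /* An omitted option behaves like NO INDENT. */
    if (strcmp(opt, "") == 0 || strcmp(opt, "NO INDENT") == 0)
        return XMLSERIALIZE_NO_FORMAT;

    return -1;
}
```

With this mapping, plain CANONICAL and CANONICAL WITH NO COMMENTS yield the same format, which is what the regression tests in the patch check by comparing the two outputs.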
2) This should be added to typedefs.list:
+typedef enum XmlSerializeFormat
+{
+ XMLSERIALIZE_INDENT, /* pretty-printed xml serialization */
+ XMLSERIALIZE_CANONICAL, /* canonical form without xml comments */
+ XMLSERIALIZE_CANONICAL_WITH_COMMENTS, /* canonical form with xml comments */
+ XMLSERIALIZE_NO_FORMAT /* unformatted xml representation */
+} XmlSerializeFormat;
added.
3) This change is not required:
return result;
+
#else
NO_XML_SUPPORT();
return NULL;
removed.
4) This comment body needs slight reformatting:
+ /*
+ * Parse the input according to the xmloption.
+ * XML canonical expects a well-formed XML input, so here in case of
+ * XMLSERIALIZE_CANONICAL or XMLSERIALIZE_CANONICAL_WITH_COMMENTS we
+ * force xml_parse() to parse 'data' as XMLOPTION_DOCUMENT despite
+ * of the XmlOptionType given in 'xmloption_arg'. This enables the
+ * canonicalization of CONTENT fragments if they contain a singly-rooted
+ * XML - xml_parse() will thrown an error otherwise.
+ */
reformatted.
5) Similarly here too:
- if (newline == NULL || xmlerrcxt->err_occurred)
+ * Emit declaration only if the input had one. Note: some versions of
+ * xmlSaveToBuffer leak memory if a non-null encoding argument is
+ * passed, so don't do that. We don't want any encoding conversion
+ * anyway.
+ */
+ if (decl_len == 0)
reformatted.
6) Similarly here too:
+ /*
+ * Deal with the case where we have non-singly-rooted XML.
+ * libxml's dump functions don't work well for that without help.
+ * We build a fake root node that serves as a container for the
+ * content nodes, and then iterate over the nodes.
+ */
reformatted.
7) Similarly here too:
+ /*
+ * We use this node to insert newlines in the dump. Note: in at
+ * least some libxml versions, xmlNewDocText would not attach the
+ * node to the document even if we passed it. Therefore, manage
+ * freeing of this node manually, and pass NULL here to make sure
+ * there's not a dangling link.
+ */
reformatted.
8) Should this:
+ * of the XmlOptionType given in 'xmloption_arg'. This enables the
+ * canonicalization of CONTENT fragments if they contain a singly-rooted
+ * XML - xml_parse() will thrown an error otherwise.
Be:
+ * of the XmlOptionType given in 'xmloption_arg'. This enables the
+ * canonicalization of CONTENT fragments if they contain a singly-rooted
+ * XML - xml_parse() will throw an error otherwise.
I didn't understand the suggestion in 8) :)
Thanks again for the review. Much appreciated!
v7 attached.
Best, Jim
From 4bd06615b9aa9f3f0fcebdd1bc30a0500524cdad Mon Sep 17 00:00:00 2001
From: Jim Jones <jim.jo...@uni-muenster.de>
Date: Wed, 4 Oct 2023 17:58:24 +0200
Subject: [PATCH v7] Add CANONICAL output format to xmlserialize
This patch introduces the CANONICAL option to xmlserialize, which
serializes xml documents in their canonical form - as described in
the W3C Canonical XML Version 1.1 specification. This option can
be used with the additional parameter WITH [NO] COMMENTS to keep
or remove xml comments from the canonical xml output. This feature
is based on the function xmlC14NDocDumpMemory from the C14N module
of libxml2.
This patch also includes regression tests and documentation.
---
doc/src/sgml/datatype.sgml | 41 +++-
src/backend/executor/execExprInterp.c | 2 +-
src/backend/parser/gram.y | 21 +-
src/backend/parser/parse_expr.c | 2 +-
src/backend/utils/adt/xml.c | 272 ++++++++++++++++----------
src/include/nodes/parsenodes.h | 1 +
src/include/nodes/primnodes.h | 12 +-
src/include/parser/kwlist.h | 1 +
src/include/utils/xml.h | 2 +-
src/test/regress/expected/xml.out | 112 +++++++++++
src/test/regress/expected/xml_1.out | 108 ++++++++++
src/test/regress/expected/xml_2.out | 112 +++++++++++
src/test/regress/sql/xml.sql | 63 ++++++
src/tools/pgindent/typedefs.list | 1 +
14 files changed, 638 insertions(+), 112 deletions(-)
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 5d23765705..89feb530c6 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4465,7 +4465,7 @@ xml '<foo>bar</foo>'
<type>xml</type>, uses the function
<function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
<synopsis>
-XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> [ [ NO ] INDENT ] )
+XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> [ { [ NO ] INDENT | CANONICAL [ WITH [ NO ] COMMENTS ] } ] )
</synopsis>
<replaceable>type</replaceable> can be
<type>character</type>, <type>character varying</type>, or
@@ -4482,6 +4482,45 @@ XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <repla
type likewise produces the original string.
</para>
+ <para>
+ The option <type>CANONICAL</type> converts a given
+ XML document to its <ulink url="https://www.w3.org/TR/xml-c14n11/#Terminology">canonical form</ulink>
+ based on the <ulink url="https://www.w3.org/TR/xml-c14n11/">W3C Canonical XML 1.1 Specification</ulink>.
+ It is mainly intended to let applications compare XML documents or test whether they
+ have been changed. <type>WITH COMMENTS</type> keeps XML comments in the output, while
+ <type>WITH NO COMMENTS</type> (the default) removes them.
+ </para>
+
+ <para>
+ Example:
+
+<screen><![CDATA[
+SELECT
+ xmlserialize(DOCUMENT
+ '<foo>
+ <!-- a comment -->
+ <bar c="3" b="2" a="1">42</bar>
+ <empty/>
+ </foo>'::xml AS text CANONICAL);
+ xmlserialize
+-----------------------------------------------------------
+ <foo><bar a="1" b="2" c="3">42</bar><empty></empty></foo>
+(1 row)
+
+SELECT
+ xmlserialize(DOCUMENT
+ '<foo>
+ <!-- a comment -->
+ <bar c="3" b="2" a="1">42</bar>
+ <empty/>
+ </foo>'::xml AS text CANONICAL WITH COMMENTS);
+ xmlserialize
+-----------------------------------------------------------------------------
+ <foo><!-- a comment --><bar a="1" b="2" c="3">42</bar><empty></empty></foo>
+(1 row)
+
+]]></screen>
+ </para>
<para>
When a character string value is cast to or from type
<type>xml</type> without going through <type>XMLPARSE</type> or
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 24c2b60c62..d9305c28b0 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -3943,7 +3943,7 @@ ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op)
*op->resvalue =
PointerGetDatum(xmltotext_with_options(DatumGetXmlP(value),
xexpr->xmloption,
- xexpr->indent));
+ xexpr->format));
*op->resnull = false;
}
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e56cbe77cb..8123e9207a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -615,12 +615,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> xml_root_version opt_xml_root_standalone
%type <node> xmlexists_argument
%type <ival> document_or_content
-%type <boolean> xml_indent_option xml_whitespace_option
+%type <boolean> xml_whitespace_option
%type <list> xmltable_column_list xmltable_column_option_list
%type <node> xmltable_column_el
%type <defelt> xmltable_column_option_el
%type <list> xml_namespace_list
%type <target> xml_namespace_el
+%type <ival> opt_xml_serialize_format
%type <node> func_application func_expr_common_subexpr
%type <node> func_expr func_expr_windowless
@@ -692,7 +693,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
BOOLEAN_P BOTH BREADTH BY
- CACHE CALL CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
+ CACHE CALL CALLED CANONICAL CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
CLUSTER COALESCE COLLATE COLLATION COLUMN COLUMNS COMMENT COMMENTS COMMIT
COMMITTED COMPRESSION CONCURRENTLY CONFIGURATION CONFLICT
@@ -15634,14 +15635,14 @@ func_expr_common_subexpr:
$$ = makeXmlExpr(IS_XMLROOT, NULL, NIL,
list_make3($3, $5, $6), @1);
}
- | XMLSERIALIZE '(' document_or_content a_expr AS SimpleTypename xml_indent_option ')'
+ | XMLSERIALIZE '(' document_or_content a_expr AS SimpleTypename opt_xml_serialize_format ')'
{
XmlSerialize *n = makeNode(XmlSerialize);
n->xmloption = $3;
n->expr = $4;
n->typeName = $6;
- n->indent = $7;
+ n->format = $7;
n->location = @1;
$$ = (Node *) n;
}
@@ -15797,9 +15798,13 @@ document_or_content: DOCUMENT_P { $$ = XMLOPTION_DOCUMENT; }
| CONTENT_P { $$ = XMLOPTION_CONTENT; }
;
-xml_indent_option: INDENT { $$ = true; }
- | NO INDENT { $$ = false; }
- | /*EMPTY*/ { $$ = false; }
+opt_xml_serialize_format:
+ INDENT { $$ = XMLSERIALIZE_INDENT; }
+ | NO INDENT { $$ = XMLSERIALIZE_NO_FORMAT; }
+ | CANONICAL { $$ = XMLSERIALIZE_CANONICAL; }
+ | CANONICAL WITH NO COMMENTS { $$ = XMLSERIALIZE_CANONICAL; }
+ | CANONICAL WITH COMMENTS { $$ = XMLSERIALIZE_CANONICAL_WITH_COMMENTS; }
+ | /*EMPTY*/ { $$ = XMLSERIALIZE_NO_FORMAT; }
;
xml_whitespace_option: PRESERVE WHITESPACE_P { $$ = true; }
@@ -17084,6 +17089,7 @@ unreserved_keyword:
| CACHE
| CALL
| CALLED
+ | CANONICAL
| CASCADE
| CASCADED
| CATALOG_P
@@ -17618,6 +17624,7 @@ bare_label_keyword:
| CACHE
| CALL
| CALLED
+ | CANONICAL
| CASCADE
| CASCADED
| CASE
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 64c582c344..624b0b8844 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -2443,7 +2443,7 @@ transformXmlSerialize(ParseState *pstate, XmlSerialize *xs)
typenameTypeIdAndMod(pstate, xs->typeName, &targetType, &targetTypmod);
xexpr->xmloption = xs->xmloption;
- xexpr->indent = xs->indent;
+ xexpr->format = xs->format;
xexpr->location = xs->location;
/* We actually only need these to be able to parse back the expression. */
xexpr->type = targetType;
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 2300c7ebf3..ce87451942 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -57,6 +57,7 @@
#include <libxml/xmlwriter.h>
#include <libxml/xpath.h>
#include <libxml/xpathInternals.h>
+#include <libxml/c14n.h>
/*
* We used to check for xmlStructuredErrorContext via a configure test; but
@@ -622,7 +623,7 @@ xmltotext(PG_FUNCTION_ARGS)
text *
-xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
+xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, XmlSerializeFormat format)
{
#ifdef USE_LIBXML
text *volatile result;
@@ -635,7 +636,7 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
PgXmlErrorContext *xmlerrcxt;
#endif
- if (xmloption_arg != XMLOPTION_DOCUMENT && !indent)
+ if (xmloption_arg != XMLOPTION_DOCUMENT && format == XMLSERIALIZE_NO_FORMAT)
{
/*
* We don't actually need to do anything, so just return the
@@ -646,10 +647,23 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
}
#ifdef USE_LIBXML
- /* Parse the input according to the xmloption */
- doc = xml_parse(data, xmloption_arg, true, GetDatabaseEncoding(),
- &parsed_xmloptiontype, &content_nodes,
- (Node *) &escontext);
+ /*
+ * Parse the input according to the xmloption. XML canonicalization expects
+ * well-formed XML input, so in case of XMLSERIALIZE_CANONICAL or
+ * XMLSERIALIZE_CANONICAL_WITH_COMMENTS we force xml_parse() to parse
+ * 'data' as XMLOPTION_DOCUMENT regardless of the XmlOptionType given in
+ * 'xmloption_arg'. This enables the canonicalization of CONTENT fragments
+ * if they contain singly-rooted XML - xml_parse() will throw an error
+ * otherwise.
+ */
+ if (format == XMLSERIALIZE_CANONICAL || format == XMLSERIALIZE_CANONICAL_WITH_COMMENTS)
+ doc = xml_parse(data, XMLOPTION_DOCUMENT, false,
+ GetDatabaseEncoding(), NULL, NULL, NULL);
+ else
+ doc = xml_parse(data, xmloption_arg, true, GetDatabaseEncoding(),
+ &parsed_xmloptiontype, &content_nodes,
+ (Node *) &escontext);
+
if (doc == NULL || escontext.error_occurred)
{
if (doc)
@@ -661,7 +675,7 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
}
/* If we weren't asked to indent, we're done. */
- if (!indent)
+ if (format == XMLSERIALIZE_NO_FORMAT)
{
xmlFreeDoc(doc);
return (text *) data;
@@ -670,128 +684,188 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
/* Otherwise, we gotta spin up some error handling. */
xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
- PG_TRY();
+ if (format == XMLSERIALIZE_INDENT)
{
- size_t decl_len = 0;
-
- /* The serialized data will go into this buffer. */
- buf = xmlBufferCreate();
-
- if (buf == NULL || xmlerrcxt->err_occurred)
- xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
- "could not allocate xmlBuffer");
-
- /* Detect whether there's an XML declaration */
- parse_xml_decl(xml_text2xmlChar(data), &decl_len, NULL, NULL, NULL);
-
- /*
- * Emit declaration only if the input had one. Note: some versions of
- * xmlSaveToBuffer leak memory if a non-null encoding argument is
- * passed, so don't do that. We don't want any encoding conversion
- * anyway.
- */
- if (decl_len == 0)
- ctxt = xmlSaveToBuffer(buf, NULL,
- XML_SAVE_NO_DECL | XML_SAVE_FORMAT);
- else
- ctxt = xmlSaveToBuffer(buf, NULL,
- XML_SAVE_FORMAT);
-
- if (ctxt == NULL || xmlerrcxt->err_occurred)
- xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
- "could not allocate xmlSaveCtxt");
-
- if (parsed_xmloptiontype == XMLOPTION_DOCUMENT)
- {
- /* If it's a document, saving is easy. */
- if (xmlSaveDoc(ctxt, doc) == -1 || xmlerrcxt->err_occurred)
- xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
- "could not save document to xmlBuffer");
- }
- else if (content_nodes != NULL)
+ PG_TRY();
{
- /*
- * Deal with the case where we have non-singly-rooted XML.
- * libxml's dump functions don't work well for that without help.
- * We build a fake root node that serves as a container for the
- * content nodes, and then iterate over the nodes.
- */
- xmlNodePtr root;
- xmlNodePtr newline;
+ size_t decl_len = 0;
+
+ /* The serialized data will go into this buffer. */
+ buf = xmlBufferCreate();
- root = xmlNewNode(NULL, (const xmlChar *) "content-root");
- if (root == NULL || xmlerrcxt->err_occurred)
+ if (buf == NULL || xmlerrcxt->err_occurred)
xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
- "could not allocate xml node");
+ "could not allocate xmlBuffer");
- /* This attaches root to doc, so we need not free it separately. */
- xmlDocSetRootElement(doc, root);
- xmlAddChild(root, content_nodes);
+ /* Detect whether there's an XML declaration */
+ parse_xml_decl(xml_text2xmlChar(data), &decl_len, NULL, NULL, NULL);
/*
- * We use this node to insert newlines in the dump. Note: in at
- * least some libxml versions, xmlNewDocText would not attach the
- * node to the document even if we passed it. Therefore, manage
- * freeing of this node manually, and pass NULL here to make sure
- * there's not a dangling link.
+ * Emit declaration only if the input had one. Note: some versions of
+ * xmlSaveToBuffer leak memory if a non-null encoding argument is
+ * passed, so don't do that. We don't want any encoding conversion
+ * anyway.
*/
- newline = xmlNewDocText(NULL, (const xmlChar *) "\n");
- if (newline == NULL || xmlerrcxt->err_occurred)
+ if (decl_len == 0)
+ ctxt = xmlSaveToBuffer(buf, NULL,
+ XML_SAVE_NO_DECL | XML_SAVE_FORMAT);
+ else
+ ctxt = xmlSaveToBuffer(buf, NULL,
+ XML_SAVE_FORMAT);
+
+ if (ctxt == NULL || xmlerrcxt->err_occurred)
xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
- "could not allocate xml node");
+ "could not allocate xmlSaveCtxt");
- for (xmlNodePtr node = root->children; node; node = node->next)
+ if (parsed_xmloptiontype == XMLOPTION_DOCUMENT)
{
- /* insert newlines between nodes */
- if (node->type != XML_TEXT_NODE && node->prev != NULL)
+ /* If it's a document, saving is easy. */
+ if (xmlSaveDoc(ctxt, doc) == -1 || xmlerrcxt->err_occurred)
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
+ "could not save document to xmlBuffer");
+ }
+ else if (content_nodes != NULL)
+ {
+ /*
+ * Deal with the case where we have non-singly-rooted XML.
+ * libxml's dump functions don't work well for that without help.
+ * We build a fake root node that serves as a container for the
+ * content nodes, and then iterate over the nodes.
+ */
+ xmlNodePtr root;
+ xmlNodePtr newline;
+
+ root = xmlNewNode(NULL, (const xmlChar *) "content-root");
+ if (root == NULL || xmlerrcxt->err_occurred)
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+ "could not allocate xml node");
+
+ /* This attaches root to doc, so we need not free it separately. */
+ xmlDocSetRootElement(doc, root);
+ xmlAddChild(root, content_nodes);
+
+ /*
+ * We use this node to insert newlines in the dump. Note: in at
+ * least some libxml versions, xmlNewDocText would not attach the
+ * node to the document even if we passed it. Therefore, manage
+ * freeing of this node manually, and pass NULL here to make sure
+ * there's not a dangling link.
+ */
+ newline = xmlNewDocText(NULL, (const xmlChar *) "\n");
+ if (newline == NULL || xmlerrcxt->err_occurred)
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+ "could not allocate xml node");
+
+ for (xmlNodePtr node = root->children; node; node = node->next)
{
- if (xmlSaveTree(ctxt, newline) == -1 || xmlerrcxt->err_occurred)
+ /* insert newlines between nodes */
+ if (node->type != XML_TEXT_NODE && node->prev != NULL)
+ {
+ if (xmlSaveTree(ctxt, newline) == -1 || xmlerrcxt->err_occurred)
+ {
+ xmlFreeNode(newline);
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
+ "could not save newline to xmlBuffer");
+ }
+ }
+
+ if (xmlSaveTree(ctxt, node) == -1 || xmlerrcxt->err_occurred)
{
xmlFreeNode(newline);
xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
- "could not save newline to xmlBuffer");
+ "could not save content to xmlBuffer");
}
}
- if (xmlSaveTree(ctxt, node) == -1 || xmlerrcxt->err_occurred)
- {
- xmlFreeNode(newline);
- xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
- "could not save content to xmlBuffer");
- }
+ xmlFreeNode(newline);
}
- xmlFreeNode(newline);
- }
+ if (xmlSaveClose(ctxt) == -1 || xmlerrcxt->err_occurred)
+ {
+ ctxt = NULL; /* don't try to close it again */
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
+ "could not close xmlSaveCtxtPtr");
+ }
- if (xmlSaveClose(ctxt) == -1 || xmlerrcxt->err_occurred)
+ result = (text *) xmlBuffer_to_xmltype(buf);
+ }
+ PG_CATCH();
{
- ctxt = NULL; /* don't try to close it again */
- xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
- "could not close xmlSaveCtxtPtr");
+ if (ctxt)
+ xmlSaveClose(ctxt);
+ if (buf)
+ xmlBufferFree(buf);
+ if (doc)
+ xmlFreeDoc(doc);
+
+ pg_xml_done(xmlerrcxt, true);
+
+ PG_RE_THROW();
}
+ PG_END_TRY();
+
+ xmlBufferFree(buf);
+ xmlFreeDoc(doc);
- result = (text *) xmlBuffer_to_xmltype(buf);
+ pg_xml_done(xmlerrcxt, false);
}
- PG_CATCH();
+ else if (format == XMLSERIALIZE_CANONICAL || format == XMLSERIALIZE_CANONICAL_WITH_COMMENTS)
{
- if (ctxt)
- xmlSaveClose(ctxt);
- if (buf)
- xmlBufferFree(buf);
- if (doc)
- xmlFreeDoc(doc);
+ xmlChar *xmlbuf = NULL;
+ int nbytes;
+ int with_comments = 0; /* 0 = no xml comments (default) */
- pg_xml_done(xmlerrcxt, true);
+ PG_TRY();
+ {
+ /* 1 = keeps xml comments */
+ if (format == XMLSERIALIZE_CANONICAL_WITH_COMMENTS)
+ with_comments = 1;
- PG_RE_THROW();
- }
- PG_END_TRY();
+ if (doc == NULL || escontext.error_occurred)
+ {
+ if (doc)
+ xmlFreeDoc(doc);
+ /* A soft error must be failure to conform to XMLOPTION_DOCUMENT */
+ ereport(ERROR,
+ (errcode(ERRCODE_NOT_AN_XML_DOCUMENT),
+ errmsg("not an XML document")));
+ }
- xmlBufferFree(buf);
- xmlFreeDoc(doc);
+ /*
+ * This dumps the canonicalized XML doc into the xmlChar* buffer.
+ * mode = 2 means the doc will be canonicalized using the C14N 1.1 standard.
+ */
+ nbytes = xmlC14NDocDumpMemory(doc, NULL, 2, NULL, with_comments, &xmlbuf);
- pg_xml_done(xmlerrcxt, false);
+ if (nbytes < 0 || escontext.error_occurred)
+ ereport(ERROR,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("could not canonicalize the given XML document")));
+
+ result = cstring_to_text_with_len((const char *) xmlbuf, nbytes);
+ }
+ PG_CATCH();
+ {
+ if (ctxt)
+ xmlSaveClose(ctxt);
+ if (xmlbuf)
+ xmlFree(xmlbuf);
+ if (doc)
+ xmlFreeDoc(doc);
+
+ pg_xml_done(xmlerrcxt, true);
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+
+ xmlFreeDoc(doc);
+ xmlFree(xmlbuf);
+
+ pg_xml_done(xmlerrcxt, false);
+ }
+ else
+ elog(ERROR, "invalid xmlserialize option");
return result;
#else
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f637937cd2..e62d96d383 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -840,6 +840,7 @@ typedef struct XmlSerialize
Node *expr;
TypeName *typeName;
bool indent; /* [NO] INDENT */
+ XmlSerializeFormat format; /* serialization format */
int location; /* token location, or -1 if unknown */
} XmlSerialize;
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 60d72a876b..88595032cc 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -1520,6 +1520,14 @@ typedef enum XmlOptionType
XMLOPTION_CONTENT
} XmlOptionType;
+typedef enum XmlSerializeFormat
+{
+ XMLSERIALIZE_INDENT, /* pretty-printed xml serialization */
+ XMLSERIALIZE_CANONICAL, /* canonical form without xml comments */
+ XMLSERIALIZE_CANONICAL_WITH_COMMENTS, /* canonical form with xml comments */
+ XMLSERIALIZE_NO_FORMAT /* unformatted xml representation */
+} XmlSerializeFormat;
+
typedef struct XmlExpr
{
Expr xpr;
@@ -1535,13 +1543,13 @@ typedef struct XmlExpr
List *args;
/* DOCUMENT or CONTENT */
XmlOptionType xmloption pg_node_attr(query_jumble_ignore);
- /* INDENT option for XMLSERIALIZE */
- bool indent;
/* target type/typmod for XMLSERIALIZE */
Oid type pg_node_attr(query_jumble_ignore);
int32 typmod pg_node_attr(query_jumble_ignore);
/* token location, or -1 if unknown */
int location;
+ /* serialization format for XMLSERIALIZE (see XmlSerializeFormat) */
+ XmlSerializeFormat format pg_node_attr(query_jumble_ignore);
} XmlExpr;
/*
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 5984dcfa4b..3233879cc3 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -68,6 +68,7 @@ PG_KEYWORD("by", BY, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("cache", CACHE, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("call", CALL, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("called", CALLED, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("canonical", CANONICAL, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("cascade", CASCADE, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("cascaded", CASCADED, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("case", CASE, RESERVED_KEYWORD, BARE_LABEL)
diff --git a/src/include/utils/xml.h b/src/include/utils/xml.h
index 224f6d75ff..b74f216148 100644
--- a/src/include/utils/xml.h
+++ b/src/include/utils/xml.h
@@ -78,7 +78,7 @@ extern xmltype *xmlpi(const char *target, text *arg, bool arg_is_null, bool *res
extern xmltype *xmlroot(xmltype *data, text *version, int standalone);
extern bool xml_is_document(xmltype *arg);
extern text *xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg,
- bool indent);
+ XmlSerializeFormat format);
extern char *escape_xml(const char *str);
extern char *map_sql_identifier_to_xml_name(const char *ident, bool fully_escaped, bool escape_period);
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 398345ca67..83fbdc5223 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -672,6 +672,118 @@ SELECT xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text
t
(1 row)
+-- xmlserialize: canonical
+CREATE TABLE xmltest_serialize (id int, doc xml);
+INSERT INTO xmltest_serialize VALUES
+ (1,'<?xml version="1.0" encoding="ISO-8859-1"?>
+ <!DOCTYPE doc SYSTEM "doc.dtd" [
+ <!ENTITY val "42">
+ <!ATTLIST xyz attr CDATA "default">
+ ]>
+
+ <!-- attributes and namespces will be sorted -->
+ <foo a:attr="out" b:attr="sorted" attr2="all" attr="I am"
+ xmlns:b="http://www.ietf.org"
+ xmlns:a="http://www.w3.org"
+ xmlns="http://example.org">
+
+ <!-- Normalization of whitespace in start and end tags -->
+ <!-- Elimination of superfluous namespace declarations, as already declared in <foo> -->
+ <bar xmlns="" xmlns:a="http://www.w3.org" >&val;</bar >
+
+ <!-- empty element will be converted to start-end tag pair -->
+ <empty/>
+
+ <!-- text will be transcoded to UTF-8 -->
+ <transcode>1</transcode>
+
+ <!-- default attribute will be added -->
+ <!-- whitespace inside tag will be preserved -->
+ <whitespace> 321 </whitespace>
+
+ <!-- empty namespace will be removed of child tag -->
+ <emptyns xmlns="" >
+ <emptyns_child xmlns=""></emptyns_child>
+ </emptyns>
+
+ <!-- CDATA section will be replaced by its value -->
+ <compute><![CDATA[value>"0" && value<"10" ?"valid":"error"]]></compute>
+ </foo>
+ <!-- comment outside doc -->'::xml),
+ (2,'<foo>
+ <bar>
+ <!-- important comment -->
+ <val x="y">42</val>
+ </bar>
+ </foo> '::xml);
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL) FROM xmltest_serialize WHERE id = 1;
+ xmlserialize
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ <foo xmlns="http://example.org" xmlns:a="http://www.w3.org" xmlns:b="http://www.ietf.org" attr="I am" attr2="all" b:attr="sorted" a:attr="out"><bar xmlns="">42</bar><empty></empty><transcode>1</transcode><whitespace> 321 </whitespace><emptyns xmlns=""><emptyns_child></emptyns_child></emptyns><compute>value>"0" && value<"10" ?"valid":"error"</compute></foo>
+(1 row)
+
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL WITH COMMENTS) FROM xmltest_serialize WHERE id = 2;
+ xmlserialize
+---------------------------------------------------------------------
+ <foo><bar><!-- important comment --><val x="y">42</val></bar></foo>
+(1 row)
+
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL) = xmlserialize(DOCUMENT doc AS text CANONICAL WITH NO COMMENTS) FROM xmltest_serialize;
+ ?column?
+----------
+ t
+ t
+(2 rows)
+
+SELECT xmlserialize(CONTENT doc AS text CANONICAL) FROM xmltest_serialize WHERE id = 1;
+ xmlserialize
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ <foo xmlns="http://example.org" xmlns:a="http://www.w3.org" xmlns:b="http://www.ietf.org" attr="I am" attr2="all" b:attr="sorted" a:attr="out"><bar xmlns="">42</bar><empty></empty><transcode>1</transcode><whitespace> 321 </whitespace><emptyns xmlns=""><emptyns_child></emptyns_child></emptyns><compute>value>"0" && value<"10" ?"valid":"error"</compute></foo>
+(1 row)
+
+SELECT xmlserialize(CONTENT doc AS text CANONICAL WITH COMMENTS) FROM xmltest_serialize WHERE id = 2;
+ xmlserialize
+---------------------------------------------------------------------
+ <foo><bar><!-- important comment --><val x="y">42</val></bar></foo>
+(1 row)
+
+SELECT xmlserialize(CONTENT doc AS text CANONICAL) = xmlserialize(CONTENT doc AS text CANONICAL WITH NO COMMENTS) FROM xmltest_serialize;
+ ?column?
+----------
+ t
+ t
+(2 rows)
+
+SELECT xmlserialize(DOCUMENT NULL AS text CANONICAL);
+ xmlserialize
+--------------
+
+(1 row)
+
+SELECT xmlserialize(CONTENT NULL AS text CANONICAL);
+ xmlserialize
+--------------
+
+(1 row)
+
+\set VERBOSITY terse
+SELECT xmlserialize(DOCUMENT '' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(DOCUMENT ' ' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(DOCUMENT 'foo' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(CONTENT '' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(CONTENT ' ' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(CONTENT 'foo' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(DOCUMENT '<foo><bar>73</bar></foo>' AS text CANONICAL INDENT);
+ERROR: syntax error at or near "INDENT" at character 75
+SELECT xmlserialize(CONTENT '<foo><bar>73</bar></foo>' AS text CANONICAL INDENT);
+ERROR: syntax error at or near "INDENT" at character 74
+\set VERBOSITY default
SELECT xml '<foo>bar</foo>' IS DOCUMENT;
?column?
----------
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index 63b779470f..481badaa84 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -443,6 +443,114 @@ ERROR: unsupported XML feature
LINE 1: SELECT xmlserialize(CONTENT '<foo><bar><val x="y">42</val><...
^
DETAIL: This functionality requires the server to be built with libxml support.
+-- xmlserialize: canonical
+CREATE TABLE xmltest_serialize (id int, doc xml);
+INSERT INTO xmltest_serialize VALUES
+ (1,'<?xml version="1.0" encoding="ISO-8859-1"?>
+ <!DOCTYPE doc SYSTEM "doc.dtd" [
+ <!ENTITY val "42">
+ <!ATTLIST xyz attr CDATA "default">
+ ]>
+
+ <!-- attributes and namespces will be sorted -->
+ <foo a:attr="out" b:attr="sorted" attr2="all" attr="I am"
+ xmlns:b="http://www.ietf.org"
+ xmlns:a="http://www.w3.org"
+ xmlns="http://example.org">
+
+ <!-- Normalization of whitespace in start and end tags -->
+ <!-- Elimination of superfluous namespace declarations, as already declared in <foo> -->
+ <bar xmlns="" xmlns:a="http://www.w3.org" >&val;</bar >
+
+ <!-- empty element will be converted to start-end tag pair -->
+ <empty/>
+
+ <!-- text will be transcoded to UTF-8 -->
+ <transcode>1</transcode>
+
+ <!-- default attribute will be added -->
+ <!-- whitespace inside tag will be preserved -->
+ <whitespace> 321 </whitespace>
+
+ <!-- empty namespace will be removed of child tag -->
+ <emptyns xmlns="" >
+ <emptyns_child xmlns=""></emptyns_child>
+ </emptyns>
+
+ <!-- CDATA section will be replaced by its value -->
+ <compute><![CDATA[value>"0" && value<"10" ?"valid":"error"]]></compute>
+ </foo>
+ <!-- comment outside doc -->'::xml),
+ (2,'<foo>
+ <bar>
+ <!-- important comment -->
+ <val x="y">42</val>
+ </bar>
+ </foo> '::xml);
+ERROR: unsupported XML feature
+LINE 2: (1,'<?xml version="1.0" encoding="ISO-8859-1"?>
+ ^
+DETAIL: This functionality requires the server to be built with libxml support.
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL) FROM xmltest_serialize WHERE id = 1;
+ xmlserialize
+--------------
+(0 rows)
+
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL WITH COMMENTS) FROM xmltest_serialize WHERE id = 2;
+ xmlserialize
+--------------
+(0 rows)
+
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL) = xmlserialize(DOCUMENT doc AS text CANONICAL WITH NO COMMENTS) FROM xmltest_serialize;
+ ?column?
+----------
+(0 rows)
+
+SELECT xmlserialize(CONTENT doc AS text CANONICAL) FROM xmltest_serialize WHERE id = 1;
+ xmlserialize
+--------------
+(0 rows)
+
+SELECT xmlserialize(CONTENT doc AS text CANONICAL WITH COMMENTS) FROM xmltest_serialize WHERE id = 2;
+ xmlserialize
+--------------
+(0 rows)
+
+SELECT xmlserialize(CONTENT doc AS text CANONICAL) = xmlserialize(CONTENT doc AS text CANONICAL WITH NO COMMENTS) FROM xmltest_serialize;
+ ?column?
+----------
+(0 rows)
+
+SELECT xmlserialize(DOCUMENT NULL AS text CANONICAL);
+ xmlserialize
+--------------
+
+(1 row)
+
+SELECT xmlserialize(CONTENT NULL AS text CANONICAL);
+ xmlserialize
+--------------
+
+(1 row)
+
+\set VERBOSITY terse
+SELECT xmlserialize(DOCUMENT '' AS text CANONICAL);
+ERROR: unsupported XML feature at character 30
+SELECT xmlserialize(DOCUMENT ' ' AS text CANONICAL);
+ERROR: unsupported XML feature at character 30
+SELECT xmlserialize(DOCUMENT 'foo' AS text CANONICAL);
+ERROR: unsupported XML feature at character 30
+SELECT xmlserialize(CONTENT '' AS text CANONICAL);
+ERROR: unsupported XML feature at character 29
+SELECT xmlserialize(CONTENT ' ' AS text CANONICAL);
+ERROR: unsupported XML feature at character 29
+SELECT xmlserialize(CONTENT 'foo' AS text CANONICAL);
+ERROR: unsupported XML feature at character 29
+SELECT xmlserialize(DOCUMENT '<foo><bar>73</bar></foo>' AS text CANONICAL INDENT);
+ERROR: syntax error at or near "INDENT" at character 75
+SELECT xmlserialize(CONTENT '<foo><bar>73</bar></foo>' AS text CANONICAL INDENT);
+ERROR: syntax error at or near "INDENT" at character 74
+\set VERBOSITY default
SELECT xml '<foo>bar</foo>' IS DOCUMENT;
ERROR: unsupported XML feature
LINE 1: SELECT xml '<foo>bar</foo>' IS DOCUMENT;
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index 43c2558352..6dea6ca38d 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -652,6 +652,118 @@ SELECT xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text
t
(1 row)
+-- xmlserialize: canonical
+CREATE TABLE xmltest_serialize (id int, doc xml);
+INSERT INTO xmltest_serialize VALUES
+ (1,'<?xml version="1.0" encoding="ISO-8859-1"?>
+ <!DOCTYPE doc SYSTEM "doc.dtd" [
+ <!ENTITY val "42">
+ <!ATTLIST xyz attr CDATA "default">
+ ]>
+
+ <!-- attributes and namespces will be sorted -->
+ <foo a:attr="out" b:attr="sorted" attr2="all" attr="I am"
+ xmlns:b="http://www.ietf.org"
+ xmlns:a="http://www.w3.org"
+ xmlns="http://example.org">
+
+ <!-- Normalization of whitespace in start and end tags -->
+ <!-- Elimination of superfluous namespace declarations, as already declared in <foo> -->
+ <bar xmlns="" xmlns:a="http://www.w3.org" >&val;</bar >
+
+ <!-- empty element will be converted to start-end tag pair -->
+ <empty/>
+
+ <!-- text will be transcoded to UTF-8 -->
+ <transcode>1</transcode>
+
+ <!-- default attribute will be added -->
+ <!-- whitespace inside tag will be preserved -->
+ <whitespace> 321 </whitespace>
+
+ <!-- empty namespace will be removed of child tag -->
+ <emptyns xmlns="" >
+ <emptyns_child xmlns=""></emptyns_child>
+ </emptyns>
+
+ <!-- CDATA section will be replaced by its value -->
+ <compute><![CDATA[value>"0" && value<"10" ?"valid":"error"]]></compute>
+ </foo>
+ <!-- comment outside doc -->'::xml),
+ (2,'<foo>
+ <bar>
+ <!-- important comment -->
+ <val x="y">42</val>
+ </bar>
+ </foo> '::xml);
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL) FROM xmltest_serialize WHERE id = 1;
+ xmlserialize
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ <foo xmlns="http://example.org" xmlns:a="http://www.w3.org" xmlns:b="http://www.ietf.org" attr="I am" attr2="all" b:attr="sorted" a:attr="out"><bar xmlns="">42</bar><empty></empty><transcode>1</transcode><whitespace> 321 </whitespace><emptyns xmlns=""><emptyns_child></emptyns_child></emptyns><compute>value&gt;"0" &amp;&amp; value&lt;"10" ?"valid":"error"</compute></foo>
+(1 row)
+
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL WITH COMMENTS) FROM xmltest_serialize WHERE id = 2;
+ xmlserialize
+---------------------------------------------------------------------
+ <foo><bar><!-- important comment --><val x="y">42</val></bar></foo>
+(1 row)
+
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL) = xmlserialize(DOCUMENT doc AS text CANONICAL WITH NO COMMENTS) FROM xmltest_serialize;
+ ?column?
+----------
+ t
+ t
+(2 rows)
+
+SELECT xmlserialize(CONTENT doc AS text CANONICAL) FROM xmltest_serialize WHERE id = 1;
+ xmlserialize
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ <foo xmlns="http://example.org" xmlns:a="http://www.w3.org" xmlns:b="http://www.ietf.org" attr="I am" attr2="all" b:attr="sorted" a:attr="out"><bar xmlns="">42</bar><empty></empty><transcode>1</transcode><whitespace> 321 </whitespace><emptyns xmlns=""><emptyns_child></emptyns_child></emptyns><compute>value&gt;"0" &amp;&amp; value&lt;"10" ?"valid":"error"</compute></foo>
+(1 row)
+
+SELECT xmlserialize(CONTENT doc AS text CANONICAL WITH COMMENTS) FROM xmltest_serialize WHERE id = 2;
+ xmlserialize
+---------------------------------------------------------------------
+ <foo><bar><!-- important comment --><val x="y">42</val></bar></foo>
+(1 row)
+
+SELECT xmlserialize(CONTENT doc AS text CANONICAL) = xmlserialize(CONTENT doc AS text CANONICAL WITH NO COMMENTS) FROM xmltest_serialize;
+ ?column?
+----------
+ t
+ t
+(2 rows)
+
+SELECT xmlserialize(DOCUMENT NULL AS text CANONICAL);
+ xmlserialize
+--------------
+
+(1 row)
+
+SELECT xmlserialize(CONTENT NULL AS text CANONICAL);
+ xmlserialize
+--------------
+
+(1 row)
+
+\set VERBOSITY terse
+SELECT xmlserialize(DOCUMENT '' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(DOCUMENT ' ' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(DOCUMENT 'foo' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(CONTENT '' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(CONTENT ' ' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(CONTENT 'foo' AS text CANONICAL);
+ERROR: invalid XML document
+SELECT xmlserialize(DOCUMENT '<foo><bar>73</bar></foo>' AS text CANONICAL INDENT);
+ERROR: syntax error at or near "INDENT" at character 75
+SELECT xmlserialize(CONTENT '<foo><bar>73</bar></foo>' AS text CANONICAL INDENT);
+ERROR: syntax error at or near "INDENT" at character 74
+\set VERBOSITY default
SELECT xml '<foo>bar</foo>' IS DOCUMENT;
?column?
----------
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index a591eea2e5..a3eda9e84f 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -168,6 +168,69 @@ SELECT xmlserialize(CONTENT '<foo><bar></bar></foo>' AS text INDENT);
-- 'no indent' = not using 'no indent'
SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
SELECT xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+-- xmlserialize: canonical
+CREATE TABLE xmltest_serialize (id int, doc xml);
+INSERT INTO xmltest_serialize VALUES
+ (1,'<?xml version="1.0" encoding="ISO-8859-1"?>
+ <!DOCTYPE doc SYSTEM "doc.dtd" [
+ <!ENTITY val "42">
+ <!ATTLIST xyz attr CDATA "default">
+ ]>
+
+ <!-- attributes and namespces will be sorted -->
+ <foo a:attr="out" b:attr="sorted" attr2="all" attr="I am"
+ xmlns:b="http://www.ietf.org"
+ xmlns:a="http://www.w3.org"
+ xmlns="http://example.org">
+
+ <!-- Normalization of whitespace in start and end tags -->
+ <!-- Elimination of superfluous namespace declarations, as already declared in <foo> -->
+ <bar xmlns="" xmlns:a="http://www.w3.org" >&val;</bar >
+
+ <!-- empty element will be converted to start-end tag pair -->
+ <empty/>
+
+ <!-- text will be transcoded to UTF-8 -->
+ <transcode>1</transcode>
+
+ <!-- default attribute will be added -->
+ <!-- whitespace inside tag will be preserved -->
+ <whitespace> 321 </whitespace>
+
+ <!-- empty namespace will be removed of child tag -->
+ <emptyns xmlns="" >
+ <emptyns_child xmlns=""></emptyns_child>
+ </emptyns>
+
+ <!-- CDATA section will be replaced by its value -->
+ <compute><![CDATA[value>"0" && value<"10" ?"valid":"error"]]></compute>
+ </foo>
+ <!-- comment outside doc -->'::xml),
+ (2,'<foo>
+ <bar>
+ <!-- important comment -->
+ <val x="y">42</val>
+ </bar>
+ </foo> '::xml);
+
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL) FROM xmltest_serialize WHERE id = 1;
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL WITH COMMENTS) FROM xmltest_serialize WHERE id = 2;
+SELECT xmlserialize(DOCUMENT doc AS text CANONICAL) = xmlserialize(DOCUMENT doc AS text CANONICAL WITH NO COMMENTS) FROM xmltest_serialize;
+SELECT xmlserialize(CONTENT doc AS text CANONICAL) FROM xmltest_serialize WHERE id = 1;
+SELECT xmlserialize(CONTENT doc AS text CANONICAL WITH COMMENTS) FROM xmltest_serialize WHERE id = 2;
+SELECT xmlserialize(CONTENT doc AS text CANONICAL) = xmlserialize(CONTENT doc AS text CANONICAL WITH NO COMMENTS) FROM xmltest_serialize;
+SELECT xmlserialize(DOCUMENT NULL AS text CANONICAL);
+SELECT xmlserialize(CONTENT NULL AS text CANONICAL);
+\set VERBOSITY terse
+SELECT xmlserialize(DOCUMENT '' AS text CANONICAL);
+SELECT xmlserialize(DOCUMENT ' ' AS text CANONICAL);
+SELECT xmlserialize(DOCUMENT 'foo' AS text CANONICAL);
+SELECT xmlserialize(CONTENT '' AS text CANONICAL);
+SELECT xmlserialize(CONTENT ' ' AS text CANONICAL);
+SELECT xmlserialize(CONTENT 'foo' AS text CANONICAL);
+SELECT xmlserialize(DOCUMENT '<foo><bar>73</bar></foo>' AS text CANONICAL INDENT);
+SELECT xmlserialize(CONTENT '<foo><bar>73</bar></foo>' AS text CANONICAL INDENT);
+\set VERBOSITY default
SELECT xml '<foo>bar</foo>' IS DOCUMENT;
SELECT xml '<foo>bar</foo><bar>foo</bar>' IS DOCUMENT;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 8de90c4958..ed48d57c99 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3113,6 +3113,7 @@ XmlExpr
XmlExprOp
XmlOptionType
XmlSerialize
+XmlSerializeFormat
XmlTableBuilderData
YYLTYPE
YYSTYPE
--
2.34.1