Hi,
While testing a feature reported by Pavel in this thread[1] I realized
that elements with whitespace between them are not indented by
XMLSERIALIZE( ... INDENT):
SELECT xmlserialize(DOCUMENT '<foo><bar>42</bar></foo>' AS text INDENT);
  xmlserialize
-----------------
 <foo>          +
   <bar>42</bar>+
 </foo>         +

(1 row)
SELECT xmlserialize(DOCUMENT '<foo> <bar>42</bar> </foo>'::xml AS text
INDENT);
        xmlserialize
----------------------------
 <foo> <bar>42</bar> </foo>+

(1 row)
Other products take a different approach[2].
Perhaps simply setting xml_parse's "preserve_whitespace" parameter to false
in xmltotext_with_options when XMLSERIALIZE( ... INDENT) is used would do
the trick:
doc = xml_parse(data, xmloption_arg, !indent ? true : false,
GetDatabaseEncoding(),
&parsed_xmloptiontype, &content_nodes,
(Node *) &escontext);
(diff attached)
SELECT xmlserialize(DOCUMENT '<foo> <bar>42</bar> </foo>'::xml AS text
INDENT);
  xmlserialize
-----------------
 <foo>          +
   <bar>42</bar>+
 </foo>         +

(1 row)
If this is indeed the way to go I can update the regression tests accordingly.
Best,
--
Jim
1 - https://www.postgresql.org/message-id/cbd68a31-9776-4742-9c09-4344a4c5e6dc%40uni-muenster.de
2 - https://dbfiddle.uk/zdKnfsqX
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 447e72b21e..1cd4929870 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -677,8 +677,14 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
}
#ifdef USE_LIBXML
- /* Parse the input according to the xmloption */
- doc = xml_parse(data, xmloption_arg, true, GetDatabaseEncoding(),
+ /*
+ * Parse the input according to the xmloption. preserve_whitespace is
+ * set to false when an indented result is requested; otherwise libxml2
+ * keeps the whitespace between elements as text nodes and does not
+ * indent them.
+ */
+ doc = xml_parse(data, xmloption_arg, !indent ? true : false,
+ GetDatabaseEncoding(),
&parsed_xmloptiontype, &content_nodes,
(Node *) &escontext);
if (doc == NULL || escontext.error_occurred)