Hi

2017-11-06 14:00 GMT+01:00 Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp>:

> Thank you for the new patch.
>
> - The latest patch is missing xpath_parser.h at least since
>   ns-3. That of the first (not-numbered) version was still
>   usable.
>
> - c29c578 conflicts on doc/src/sgml/func.sgml
>
>
> At Sun, 15 Oct 2017 12:06:11 +0200, Pavel Stehule <pavel.steh...@gmail.com>
> wrote in <CAFj8pRCYBH+a6oJoEYUFDUpBQ1ySwtt2CfnFZxs2Ab9efon...@mail.gmail.com>
> > 2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp>:
> >
> > > Hi, thanks for the new patch.
> > >
> > > # The patch is missing xpath_parser.h. That of the first patch was usable.
> > >
> > > At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule <pavel.steh...@gmail.com>
> > > wrote in <CAFj8pRBMQa07a=+qQAVMtz5M_hqkJBhiQSOP76+-BrFDj37pvg@mail.gmail.com>
> > > > Hi
> > > >
> > > > now xpath and xpath_exists supports default namespace too
> > >
> > > At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule <pavel.steh...@gmail.com>
> > > wrote in <CAFj8pRCZ8oneG7g2vxs9ux71n8A9twwUO7zQpJiuz+7RGSpSuw@mail.gmail.com>
> > > > > 1. Uniformity among simliar features
> > > > >
> > > > > As mentioned in the proposal, but it is lack of uniformity that
> > > > > the xpath transformer is applied only to xmltable and not for
> > > > > other xpath related functions.
> > > > >
> > > >
> > > > I have to fix the XPath function. The SQL/XML function Xmlexists doesn't
> > > > support namespaces/
> > >
> > > Sorry, I forgot to care about that. (And the definition of
> > > namespace array is of course fabricated by me). I'd like to leave
> > > this to committers. Anyway it is working but the syntax (or
> > > whether it is acceptable) is still arguable.
> > >
> > > SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
> > >              ARRAY[ARRAY['', 'http://example.com']]);
> > > |  xpath
> > > | --------
> > > |  {test}
> > > | (1 row)
> > >
> > >
> > > The internal name is properly rejected, but the current internal
> > > name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We
> > > are preserving some short names and reject them as
> > > user-defined. Doesn't just 'pgsqlxml' work?
> > >
> > >
> > > Default namespace correctly become to be applied on bare
> > > attribute names.
> > >
> > > > updated doc,
> > > > fixed all variants of expected result test file
> > >
> > > Sorry for one by one comment but I found another misbehavior.
> > >
> > > create table t1 (id int, doc xml);
> > > insert into t1
> > >    values
> > >    (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
> > > select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
> > >                          '/x:rows/x:row' passing t1.doc columns data int PATH
> > >                          'child::x:a[1][attribute::hoge="haha"]') as x;
> > > |  data
> > > | ------
> > > |    50
> > >
> > > but the following fails.
> > >
> > > select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
> > >                          '/rows/row' passing t1.doc columns data int PATH
> > >                          'child::a[1][attribute::hoge="haha"]') as x;
> > > |  data
> > > | ------
> > > |
> > > | (1 row)
> > >
> > > Perhaps child::a is not prefixed by the transformation.
> > >
> >
> > the problem was in unwanted attribute modification. The parser didn't
> > detect "attribute::hoge" as attribute. Updated parser does it. I reduce
> > duplicated code there more.
>
> It worked as expected. But the comparison of "attribute" is
> missing t1.length = 9 so the following expression wrongly passes.
>
> child::a[1][attributeabcdefg::hoge="haha"
>
> It is confusing that is_qual_name becomes true when t2 is not a
> "qual name", and the way it treats a double-colon is hard to
> understand.
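
The attached version compares the axis name together with the token length. To illustrate why the length test matters, here is a minimal sketch, assuming a hypothetical helper token_equals() that is not part of the patch (the patch open-codes strncmp plus a length comparison). Tokens are (start, length) pairs pointing into the XPath string, so they are not NUL-terminated and strncmp alone only proves a common prefix.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): compare a token given as
 * (start, length) with a NUL-terminated keyword.
 */
bool
token_equals(const char *start, int length, const char *keyword)
{
	size_t		klen;

	assert(start != NULL && keyword != NULL);

	klen = strlen(keyword);

	/*
	 * Without the length test, "attributeabcdefg" would match "attribute",
	 * because strncmp() only looks at the first klen bytes.
	 */
	return (size_t) length == klen && strncmp(start, keyword, klen) == 0;
}
```

With this, the token "attribute" (length 9) in attribute::hoge matches, while "attributeabcdefg" (length 16) in the broken expression above does not.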
>
> It essentially does inserting the default namespace before
> unqualified non-attribute name. I believe we can easily
> look-ahead to detect a double colon and it would make things
> simpler. Could you consider something like the attached patch?
> (applies on top of ns-4 patch.)
>
> > > XPath might be complex enough so that it's worth switching to
> > > yacc/lex based transformer that is formally verifiable and won't
> > > need a bunch of cryptic tests that finally cannot prove the
> > > completeness. synchronous_standy_names is far simpler than XPath
> > > but using yacc/lex parser.
> > >
> > >
> > > Anyway the following is nitpicking of the current xpath_parser.c.
> > >
> > > - NODENAME_FIRSTCHAR allows '-' as the first char but it is
> > >   excluded from NameStartChar (https://www.w3.org/TR/REC-xml/#NT-NameStartChar)
> > >   I think characters with high-bit set is okay.
> > >   Also IS_NODENAME_CHAR should be changed.
> > >
> >
> > fixed
> >
> >
> > > - NODENAME_FIRSTCHAR and IS_NODENAME_CHAR is in the same category
> > >   but have different naming schemes. Can these are named in the same way?
> > >
> >
> > fixed
> >
> >
> > > - The current transoformer seems to using up to one token stack
> > >   depth. Maybe the stack is needless. (pushed token is always
> > >   popped just after)
> > >
> >
> > fixed
>
> Thank you.
>
> I found another (and should be the last, so sorry..) functional
> defect in this. This doesn't add default namespace if the tag
> name in a predicate is 'and' or 'or'. It needs to be fixed, or
> wrote in the documentation as a restriction. (seem hard to fix
> it..)
>
> create table t1 (id int, doc xml);
> insert into t1 values (1, '<rows xmlns="http://x.y"><row><val>50</val><and>60</and></row></rows>');
> select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
>                     '/x:rows/x:row' passing t1.doc columns data int PATH
>                     'x:val[../x:and = 60]') as x;
>  data
> ------
>    50
> (1 row)
> select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
>                     '/rows/row' passing t1.doc columns data int PATH
>                     'val[../and = 60]') as x;
>  data
> ------
>
> (1 row)
>

Yes, this check needs a context-aware parser. I expect this to be a corner
case that is not very common, so a documentation-based solution is enough.

>
>
> Other comments are follows.
>
> - Please add more comments. XPATH_TOKEN_NAME in _transformXPath
>   in my patch has more
>
> - Debug output might be needed.
>
> # sorry now time's up. will continue tomorrow.
>

I have fixed, I hope, almost all issues. Your patch is merged with some
changes. The most significant change is the reaction to a broken XPath
expression: I prefer to do nothing and let libxml2 raise the error.

A new version is attached.

Thank you for the tips, ideas, and code :)

> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>

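To make the documented restriction on 'and', 'or', 'div' and 'mod' concrete, here is a simplified sketch of the decision the transformer takes for a name token. The helper names is_predicate_keyword() and name_gets_default_ns() are hypothetical and exist only for this illustration; the sketch also ignores the other cases the real code handles (explicit prefixes, axis names, function names, attributes).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: is the token (start, length) one of the XPath
 * operator keywords that the transformer never prefixes in a predicate?
 */
bool
is_predicate_keyword(const char *start, int length)
{
	static const char *const keywords[] = {"and", "or", "div", "mod"};
	size_t		i;

	assert(start != NULL);

	for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
	{
		size_t		klen = strlen(keywords[i]);

		if ((size_t) length == klen &&
			strncmp(start, keywords[i], klen) == 0)
			return true;
	}
	return false;
}

/*
 * Hypothetical, simplified decision: a bare name gets the default
 * namespace prefix unless it looks like an operator inside a predicate.
 */
bool
name_gets_default_ns(const char *start, int length, bool inside_predicate)
{
	return !(inside_predicate && is_predicate_keyword(start, length));
}
```

Under this rule the tag name in 'val[../and = 60]' is taken as the operator and stays unprefixed, which is why that query returns an empty value, while the same name outside a predicate, or a longer name such as 'android', would still be prefixed.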
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f901567f7e..76424efa31 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10468,7 +10468,8 @@ SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuf
     second the namespace URI. It is not required that aliases provided in
     this array be the same as those being used in the XML document itself (in
     other words, both in the XML document and in the <function>xpath</function>
-    function context, aliases are <emphasis>local</emphasis>).
+    function context, aliases are <emphasis>local</emphasis>). Default namespace has
+    empty name (empty string) and should be only one.
    </para>
 
    <para>
@@ -10484,11 +10485,20 @@ SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
 ]]></screen>
    </para>
 
+   <para>
+    Inside predicate literals <literal>and</literal>, <literal>or</literal>,
+    <literal>div</literal> and <literal>mod</literal> are used as keywords
+    (XPath operators) every time and default namespace are not applied there.
+    If you would to use these literals like tag names, then the default namespace
+    should not be used, and these literals should be explicitly
+    labeled.
+   </para>
+
    <para>
     To deal with default (anonymous) namespaces, do something like this:
 <screen><![CDATA[
-SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
-             ARRAY[ARRAY['mydefns', 'http://example.com']]);
+SELECT xpath('//b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
+             ARRAY[ARRAY['', 'http://example.com']]);
 
  xpath
 --------
@@ -10562,8 +10572,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</m
    <para>
     The optional <literal>XMLNAMESPACES</literal> clause is a comma-separated
     list of namespaces. It specifies the XML namespaces used in
-    the document and their aliases. A default namespace specification
-    is not currently supported.
+    the document and their aliases.
    </para>
 
    <para>
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 1fb018416e..b60a3cfe0d 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index c9d07f2ae9..8c7f37df4c 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -90,7 +90,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -187,6 +187,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -227,6 +228,7 @@ const TableFuncRoutine XmlTableRoutine =
 #define NAMESPACE_XSI "http://www.w3.org/2001/XMLSchema-instance"
 #define NAMESPACE_SQLXML "http://standards.iso.org/iso/9075/2003/sqlxml"
 
+#define DEFAULT_NAMESPACE_NAME "pgdefnamespace.pgsqlxml.internal"
 
 #ifdef USE_LIBXML
 
@@ -3849,6 +3851,7 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	int			ndim;
 	Datum	   *ns_names_uris;
 	bool	   *ns_names_uris_nulls;
+	bool		with_default_ns = false;
 	int			ns_count;
 
 	/*
@@ -3898,7 +3901,6 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 				 errmsg("empty XPath expression")));
 
 	string = pg_xmlCharStrndup(datastr, len);
-	xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -3941,6 +3943,26 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 							(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
 							 errmsg("neither namespace name nor URI may be null")));
 				ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+
+				/* Don't allow same namespace as out internal default namespace name */
+				if (strcmp(ns_name, DEFAULT_NAMESPACE_NAME) == 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_RESERVED_NAME),
+							 errmsg("cannot to use \"%s\" as namespace name",
+									DEFAULT_NAMESPACE_NAME),
+							 errdetail("\"%s\" is reserved for internal purpose",
+									   DEFAULT_NAMESPACE_NAME)));
+				if (*ns_name == '\0')
+				{
+					if (with_default_ns)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("only one default namespace is allowed")));
+
+					with_default_ns = true;
+					ns_name = DEFAULT_NAMESPACE_NAME;
+				}
+
 				ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
 				if (xmlXPathRegisterNs(xpathctx,
 									   (xmlChar *) ns_name,
@@ -3951,6 +3973,16 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 			}
 		}
 
+		if (with_default_ns)
+		{
+			StringInfoData str;
+
+			transformXPath(&str, text_to_cstring(xpath_expr_text), DEFAULT_NAMESPACE_NAME);
+			xpath_expr = pg_xmlCharStrndup(str.data, str.len);
+		}
+		else
+			xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
+
 		xpathcomp = xmlXPathCompile(xpath_expr);
 		if (xpathcomp == NULL || xmlerrcxt->err_occurred)
 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
@@ -4195,6 +4227,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4287,6 +4320,7 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif							/* not USE_LIBXML */
 }
 
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4297,12 +4331,25 @@ XmlTableSetNamespace(TableFuncScanState *state, char *name, char *uri)
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name != NULL)
+	{
+		/* Don't allow same namespace as out internal default namespace name */
+		if (strcmp(name, DEFAULT_NAMESPACE_NAME) == 0)
+			ereport(ERROR,
+					(errcode(ERRCODE_RESERVED_NAME),
+					 errmsg("cannot to use \"%s\" as namespace name",
+							DEFAULT_NAMESPACE_NAME),
+					 errdetail("\"%s\" is reserved for internal purpose",
+							   DEFAULT_NAMESPACE_NAME)));
+	}
+	else
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 						   pg_xmlCharStrndup(name, strlen(name)),
 						   pg_xmlCharStrndup(uri, strlen(uri))))
@@ -4331,6 +4378,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, char *path)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4362,6 +4417,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, char *path, int colnum)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 0000000000..35441a646c
--- /dev/null
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -0,0 +1,361 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.c
+ *	  XML XPath parser.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/adt/xpath_parser.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/xpath_parser.h"
+
+/*
+ * All PostgreSQL XML related functionality is based on libxml2 library, and
+ * XPath support is not an exception. However, libxml2 doesn't support
+ * default namespace for XPath expressions. Because there are not any API
+ * how to transform or access to parsed XPath expression we have to parse
+ * XPath here.
+ *
+ * Those functionalities are implemented with a simple XPath parser/
+ * preprocessor. This XPath parser transforms a XPath expression to another
+ * XPath expression that can be used by libxml2 XPath evaluation. It doesn't
+ * replace libxml2 XPath parser or libxml2 XPath expression evaluation.
+ */
+
+#ifdef USE_LIBXML
+
+/*
+ * We need to work with XPath expression tokens. When expression starting with
+ * nodename, then we can use prefix. When default namespace is defined, then we
+ * should to enhance any nodename and attribute without namespace by default
+ * namespace.
+ */
+
+typedef enum
+{
+	XPATH_TOKEN_NONE,
+	XPATH_TOKEN_NAME,
+	XPATH_TOKEN_STRING,
+	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_COLON,
+	XPATH_TOKEN_DCOLON,
+	XPATH_TOKEN_OTHER
+} XPathTokenType;
+
+typedef struct XPathTokenInfo
+{
+	XPathTokenType ttype;
+	char	   *start;
+	int			length;
+} XPathTokenInfo;
+
+typedef struct ParserData
+{
+	char	   *str;
+	char	   *cur;
+	XPathTokenInfo buffer;
+	bool		buffer_is_empty;
+} XPathParserData;
+
+/* Any high-bit-set character is OK (might be part of a multibyte char) */
+#define IS_NODENAME_FIRSTCHAR(c)	 ((c) == '_' || \
+								 ((c) >= 'A' && (c) <= 'Z') || \
+								 ((c) >= 'a' && (c) <= 'z') || \
+								 (IS_HIGHBIT_SET(c)))
+
+#define IS_NODENAME_CHAR(c)		(IS_NODENAME_FIRSTCHAR(c) || (c) == '-' || (c) == '.' || \
+								 ((c) >= '0' && (c) <= '9'))
+
+#define TOKEN_IS_EMPTY(t)	((t).ttype == XPATH_TOKEN_NONE)
+
+/*
+ * Returns next char after last char of token - XPath lexer
+ */
+static char *
+getXPathToken(char *str, XPathTokenInfo * ti)
+{
+	/* skip initial spaces */
+	while (*str == ' ')
+		str++;
+
+	if (*str != '\0')
+	{
+		char		c = *str;
+
+		ti->start = str++;
+
+		if (c >= '0' && c <= '9')
+		{
+			while (*str >= '0' && *str <= '9')
+				str++;
+			if (*str == '.')
+			{
+				str++;
+				while (*str >= '0' && *str <= '9')
+					str++;
+			}
+			ti->ttype = XPATH_TOKEN_NUMBER;
+		}
+		else if (IS_NODENAME_FIRSTCHAR(c))
+		{
+			while (IS_NODENAME_CHAR(*str))
+				str++;
+
+			ti->ttype = XPATH_TOKEN_NAME;
+		}
+		else if (c == '"')
+		{
+			while (*str != '\0')
+				if (*str++ == '"')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else if (c == ':')
+		{
+			/* look ahead to detect a double-colon */
+			if (*str == ':')
+			{
+				ti->ttype = XPATH_TOKEN_DCOLON;
+				str++;
+			}
+			else
+				ti->ttype = XPATH_TOKEN_COLON;
+		}
+		else
+			ti->ttype = XPATH_TOKEN_OTHER;
+
+		ti->length = str - ti->start;
+	}
+	else
+	{
+		ti->start = NULL;
+		ti->length = 0;
+
+		ti->ttype = XPATH_TOKEN_NONE;
+	}
+
+	return str;
+}
+
+/*
+ * reset XPath parser stack
+ */
+static void
+initXPathParser(XPathParserData * parser, char *str)
+{
+	parser->str = str;
+	parser->cur = str;
+	parser->buffer_is_empty = true;
+}
+
+/*
+ * Returns token from stack or read token
+ */
+static void
+nextXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+	{
+		memcpy(ti, &parser->buffer, sizeof(XPathTokenInfo));
+		parser->buffer_is_empty = true;
+	}
+	else
+		parser->cur = getXPathToken(parser->cur, ti);
+}
+
+/*
+ * Push token to stack
+ */
+static void
+pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+		elog(ERROR, "internal error");
+
+	memcpy(&parser->buffer, ti, sizeof(XPathTokenInfo));
+	parser->buffer_is_empty = false;
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * Write token to output string
+ */
+static void
+writeXPathToken(StringInfo str, XPathTokenInfo * ti)
+{
+	Assert(ti->ttype != XPATH_TOKEN_NONE);
+
+	if (ti->ttype != XPATH_TOKEN_OTHER)
+		appendBinaryStringInfo(str, ti->start, ti->length);
+	else
+		appendStringInfoChar(str, *ti->start);
+
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * This is main part of XPath transformation. It can be called recursivly,
+ * when XPath expression contains predicates.
+ */
+static void
+_transformXPath(StringInfo str, XPathParserData * parser,
+				bool inside_predicate,
+				char *def_namespace_name)
+{
+	XPathTokenInfo t1,
+				t2;
+	bool		tagname_needs_defnsp;
+	bool		token_is_tagattrib = false;
+
+	nextXPathToken(parser, &t1);
+
+	while (t1.ttype != XPATH_TOKEN_NONE)
+	{
+		switch (t1.ttype)
+		{
+			case XPATH_TOKEN_NUMBER:
+			case XPATH_TOKEN_STRING:
+			case XPATH_TOKEN_COLON:
+			case XPATH_TOKEN_DCOLON:
+				/* write without any changes */
+				writeXPathToken(str, &t1);
+				/* process fresh token */
+				nextXPathToken(parser, &t1);
+				break;
+
+			case XPATH_TOKEN_NAME:
+				{
+					/*
+					 * Inside predicate ignore keywords (literal operators)
+					 * "and" "or" "div" and "mod".
+					 */
+					if (inside_predicate)
+					{
+						if ((strncmp(t1.start, "and", 3) == 0 && t1.length == 3) ||
+							(strncmp(t1.start, "or", 2) == 0 && t1.length == 2) ||
+							(strncmp(t1.start, "div", 3) == 0 && t1.length == 3) ||
+							(strncmp(t1.start, "mod", 3) == 0 && t1.length == 3))
+						{
+							token_is_tagattrib = false;
+
+							/* keyword */
+							writeXPathToken(str, &t1);
+							/* process fresh token */
+							nextXPathToken(parser, &t1);
+							break;
+						}
+					}
+
+					tagname_needs_defnsp = true;
+
+					nextXPathToken(parser, &t2);
+					if (t2.ttype == XPATH_TOKEN_COLON)
+					{
+						/* t1 is a quilified node name. no need to add default one. */
+						tagname_needs_defnsp = false;
+
+						/* namespace name */
+						writeXPathToken(str, &t1);
+						/* colon */
+						writeXPathToken(str, &t2);
+						/* get node name */
+						nextXPathToken(parser, &t1);
+					}
+					else if (t2.ttype == XPATH_TOKEN_DCOLON)
+					{
+						/* t1 is an axis name. write out as it is */
+						if (strncmp(t1.start, "attribute", 9) == 0 && t1.length == 9)
+							token_is_tagattrib = true;
+
+						/* axis name */
+						writeXPathToken(str, &t1);
+						/* double colon */
+						writeXPathToken(str, &t2);
+
+						/*
+						 * The next token may be qualified tag name, process
+						 * it as a fresh token.
+						 */
+						nextXPathToken(parser, &t1);
+						break;
+					}
+					else if (t2.ttype == XPATH_TOKEN_OTHER)
+					{
+						/* function name doesn't require namespace */
+						if (*t2.start == '(')
+							tagname_needs_defnsp = false;
+						else
+							pushXPathToken(parser, &t2);
+					}
+
+					if (tagname_needs_defnsp && !token_is_tagattrib)
+						appendStringInfo(str, "%s:", def_namespace_name);
+
+					token_is_tagattrib = false;
+
+					/* write maybe-tagname if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t1))
+						writeXPathToken(str, &t1);
+
+					/* output t2 if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t2))
+						writeXPathToken(str, &t2);
+
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_OTHER:
+				{
+					char		c = *t1.start;
+
+					writeXPathToken(str, &t1);
+
+					if (c == '[')
+						_transformXPath(str, parser, true, def_namespace_name);
+					else
+					{
+						if (c == ']' && inside_predicate)
+						{
+							return;
+						}
+						else if (c == '@')
+						{
+							nextXPathToken(parser, &t1);
+							if (t1.ttype == XPATH_TOKEN_NAME)
+								token_is_tagattrib = true;
+
+							pushXPathToken(parser, &t1);
+						}
+					}
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_NONE:
+				elog(ERROR, "should not be here");
+		}
+	}
+}
+
+void
+transformXPath(StringInfo str, char *xpath,
+			   char *def_namespace_name)
+{
+	XPathParserData parser;
+
+	Assert(def_namespace_name != NULL);
+
+	initStringInfo(str);
+	initXPathParser(&parser, xpath);
+	_transformXPath(str, &parser, false, def_namespace_name);
+
+	elog(DEBUG1, "apply default namespace \"%s\"", str->data);
+}
+
+#endif
diff --git a/src/include/utils/xpath_parser.h b/src/include/utils/xpath_parser.h
new file mode 100644
index 0000000000..b2fc239e12
--- /dev/null
+++ b/src/include/utils/xpath_parser.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.h
+ *	  Declarations for XML XPath transformation.
+ *
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/xml.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef XPATH_PARSER_H
+#define XPATH_PARSER_H
+
+#include "postgres.h"
+#include "lib/stringinfo.h"
+
+void transformXPath(StringInfo str, char *xpath, char *def_namespace_name);
+
+#endif							/* XPATH_PARSER_H */
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index bcc585d427..63e04f1353 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1085,7 +1085,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1452,3 +1456,56 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
   14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index d3bd8c91d7..58f9151788 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1302,3 +1302,59 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
 ---
 (0 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+ERROR:  unsupported XML feature
+LINE 1: INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a ...
+                                  ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data
+------
+(0 rows)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data
+------
+(0 rows)
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="ht...
+                                                    ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x...
+                                              ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: ...ELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xml...
+                                                             ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index ff77132803..c92a09e5a9 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1065,7 +1065,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1432,3 +1436,56 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
   14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index eb4687fb09..e8cff5f22d 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -558,3 +558,23 @@ INSERT INTO xmltest2 VALUES('<d><r><dc>2</dc></r></d>', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers