Re: [JDBC] [HACKERS] JDBC connections to 9.1
On 18/04/11 15:57, Tom Lane wrote: Bernd Helmlemaili...@oopsware.de writes: If i am reading it correct, it reads UTF8 from the backend, while expecting UNICODE only. Not sure what change has caused this, though. I am --- when I redid the GUC assign_hook logic a few weeks ago, I changed the client_encoding code so that it shows the normalized (official) name of the encoding, not whatever random string the client sent over. For instance, previous versions: regression=# set client_encoding = 'UnIcOdE'; SET regression=# show client_encoding ; client_encoding - UnIcOdE (1 row) versus HEAD: regression=# set client_encoding = 'UnIcOdE'; SET regression=# show client_encoding ; client_encoding - UTF8 (1 row) I wasn't aware that JDBC would fail on that. It's pretty annoying that it does, but maybe we should grin and bear it, ie revert the change to canonicalize the GUC's value? regards, tom lane Am I right in thinking that would be that change committed on the 7th (http://archives.postgresql.org/pgsql-committers/2011-04/msg00039.php) ? I've just run the JDBC test build on my machine and it fails dismally with this very message repeated over and over again. What concerns me most is that (assuming my dates are right) the JDBC driver has been broken for 11 days and no one noticed. This would lead me to believe that there is no JDBC build server. What would it take to set one up? If someone can point me to a test machine I'd happily assist in setting one up. As for the breakage itself I'm OK with a new driver version for a new database version and from my experience people expect that. I recall a number of people asking me if an 8.4 driver would be OK to use against 9 before the 9 version was stable. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [JDBC] [HACKERS] JDBC connections to 9.1
On 18/04/11 17:35, Andrew Dunstan wrote: On 04/18/2011 11:25 AM, Tom Lane wrote: What concerns me most is that (assuming my dates are right) the JDBC driver has been broken for 11 days and no one noticed. This would lead me to believe that there is no JDBC build server. What would it take to set one up? +1 for doing something along that line. All you'd need to do is write a step for a buildfarm animal to fetch the JDBC driver and run some tests, and run it in a buildfarm client somewhere. The server code is quite agnostic about the steps that are reported on. IOW in addition to a running buildfarm member you need to write a small amount ( 100 lines, possibly much less) of perl code. cheers andrew I've found the entry on the Developer Wiki (http://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto). What I'll do is set-up three farms on my machine - one for 1.4, one for 1.5 and one for 1.6. It's been a while since I've had an excuse to write some Perl! I can't guarantee when I'll have it done as I'm away for a little over a week from Wednesday and I'm not allowed internet access! Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [JDBC] [HACKERS] JDBC connections to 9.1
On 18/04/11 17:12, Tom Lane wrote: Dave Cramerp...@fastcrypt.com writes: Well initially my concern was that people would have a challenge in the case where they had to re-certify their application if we made this change, however I realize they will have to do this anyway since upgrading to 9.1 is what necessitates it. I don't see any backwards compatibility risk, if that's what you mean. Every backend release since 7.3 has treated client_encoding 'UTF8' and 'UNICODE' the same, and earlier releases didn't accept either one. regards, tom lane As there seems to be a consensus forming for fixing the JDBC driver, I've taken the liberty do so at the risk of being shot down. The patch is fairly straightforward, just changing UNICODE to UTF8 in a number of files including the translation files. I've tested this against 9.1devel (HEAD) and 8.4.7. For each database version I build and the tests running JDKs 1.4.2_19, 1.5.0_22 and 1.6.0_2. All on 32-bit. Regards, -- Mike Fowler Registered Linux user: 379787 Index: doc/pgjdbc.xml === RCS file: /cvsroot/jdbc/pgjdbc/doc/pgjdbc.xml,v retrieving revision 1.40 diff -c -r1.40 pgjdbc.xml *** doc/pgjdbc.xml 25 Dec 2010 07:07:44 - 1.40 --- doc/pgjdbc.xml 18 Apr 2011 16:32:49 - *** *** 179,185 encoding and you will have problems the moment you store data in it that does not fit in the seven bit acronymASCII/acronym character set. If you do not know what your encoding will be or are otherwise unsure ! about what you will be storing the literalUNICODE/literal encoding is a reasonable default to use. /para /sect1 --- 179,185 encoding and you will have problems the moment you store data in it that does not fit in the seven bit acronymASCII/acronym character set. If you do not know what your encoding will be or are otherwise unsure ! about what you will be storing the literalUTF8/literal encoding is a reasonable default to use. /para /sect1 Index: org/postgresql/core/BaseConnection.java === RCS file: /cvsroot/jdbc/pgjdbc/org/postgresql/core/BaseConnection.java,v retrieving revision 1.23 diff -c -r1.23 BaseConnection.java *** org/postgresql/core/BaseConnection.java 1 May 2010 14:40:51 - 1.23 --- org/postgresql/core/BaseConnection.java 18 Apr 2011 16:32:49 - *** *** 96,102 /** * Encode a string using the database's client_encoding ! * (usually UNICODE, but can vary on older server versions). * This is used when constructing synthetic resultsets (for * example, in metadata methods). * --- 96,102 /** * Encode a string using the database's client_encoding ! * (usually UTF8, but can vary on older server versions). * This is used when constructing synthetic resultsets (for * example, in metadata methods). * Index: org/postgresql/core/v2/ConnectionFactoryImpl.java === RCS file: /cvsroot/jdbc/pgjdbc/org/postgresql/core/v2/ConnectionFactoryImpl.java,v retrieving revision 1.21 diff -c -r1.21 ConnectionFactoryImpl.java *** org/postgresql/core/v2/ConnectionFactoryImpl.java 31 Mar 2011 03:06:38 - 1.21 --- org/postgresql/core/v2/ConnectionFactoryImpl.java 18 Apr 2011 16:32:49 - *** *** 380,388 // 7.3 server that defaults to autocommit = off. if (logger.logDebug()) ! logger.debug(Switching to UNICODE client_encoding); ! String sql = begin; set autocommit = on; set client_encoding = 'UNICODE'; ; if (dbVersion.compareTo(9.0) = 0) { sql += SET extra_float_digits=3; ; } else if (dbVersion.compareTo(7.4) = 0) { --- 380,388 // 7.3 server that defaults to autocommit = off. if (logger.logDebug()) ! logger.debug(Switching to UTF8 client_encoding); ! String sql = begin; set autocommit = on; set client_encoding = 'UTF8'; ; if (dbVersion.compareTo(9.0) = 0) { sql += SET extra_float_digits=3; ; } else if (dbVersion.compareTo(7.4) = 0) { *** *** 391,397 sql += commit; SetupQueryRunner.run(protoConnection, sql, false); ! protoConnection.setEncoding(Encoding.getDatabaseEncoding(UNICODE)); } else { --- 391,397 sql += commit; SetupQueryRunner.run(protoConnection, sql, false); ! protoConnection.setEncoding(Encoding.getDatabaseEncoding(UTF8)); } else { Index: org/postgresql/core/v3/ConnectionFactoryImpl.java
Re: [HACKERS] PG 9.0.3. How to select rows from xml
Hi, On 02/04/11 20:47, emanov wrote: Hi all! What i need is transform xml document to table like that: insert into tmp(Name, Value) select t.Name, t.Value from myxml.Nodes(/doc/person) as t('Name:varchar|Value:int') or similar. In fact I have many rows with many columns. How I can do it with PG 9.0.3 where I can't find xpath_table? Though deprecated, the xml2 contrib module with xpath_table is still present. Installation instructions can be found on http://www.postgresql.org/docs/9.0/static/contrib.html and the module is documented at http://www.postgresql.org/docs/9.0/static/xml2.html. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] new keywords in 9.1
On 12/03/11 05:18, Robert Haas wrote: XMLEXISTS is pretty horrible in that the syntax apparently requires three new keywords (XMLEXISTS, PASSING, REF) which is pretty lame but I guess it's specified by the standard so I'm not sure there's much we can do about it. The rest look reasonable and necessary AFAICT I can confirm that I added the three keywords as described in the SQL/XML standard (section 8.4). Apologies for the delayed confirmation, I missed the thread when it was started and only noticed when your commit message arrived. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Native XML
On 27/02/11 19:37, David E. Wheeler wrote: On Feb 27, 2011, at 11:23 AM, Tom Lane wrote: Well, that's why I asked --- if it's going to be a huge chunk of code, then I agree this is the wrong path to pursue. However, I do feel that libxml pretty well sucks, so if we could replace it with a relatively small amount of code, that might be the right thing to do. I think that XML parsers must be hard to get really right, because of all those I've used in Perl, XML::LibXML is far and away the best. Its docs suck, but it does the work really well. No, because the xpath stuff is fundamentally broken, and nobody seems to know how to make libxslt do what we actually need. See the open bugs on the TODO list. XPath is broken? I use it heavily in the Perl module Test::XPath and now, in PostgreSQL, with my explanation extension. http://github.com/theory/explanation/ Is this something I need to worry about I don't believe that XPath is fundamentally broken, but I think Tom may have meant xslt. When reviewing a recent patch to xml2/xslt I found a few bugs in the way were using libxslt, as well as a bug in the library itself (see http://archives.postgresql.org/pgsql-hackers/2011-02/msg01878.php). However if Tom does mean that xpath is the culprit, it may be with the way the libxml2 library works. It's a very messy singleton. If I'm wrong, I'm sure I'll be corrected! Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] First patch proposal
On 14/10/10 15:53, Alastair Turner wrote: It isn't a TODO item, or related to any previous thread I could find. It's certainly something I can see a use for. When I'm having a bad typing day I get annoyed that I find I've made a mistake after I've typed the password. To me this is a feature that will just make life a little more pleasant for command line junkies like me. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [HACKERS] XML schema validation
On 15/10/10 15:08, Tomáš Pospíšil wrote: Hi Robert, I would like to contribute for community. Under licence used by PostgreSQL. So is (or was) there anybody with the same interest as me? Yes, I've been looking at improving the XML support overall, or at least working to finish the implementation of the SQL 2008 XML specification. Currently I've been looking at implementing XQuery, however I have recently started to come to the conclusion that Schema validation should be implemented first as there is some overlap of functionality. What have you looked at? Do you have an outline design you could share on how you would go about adding schema validation? Regrads, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] First patch proposal
On 14/10/10 15:53, Alastair Turner wrote: It isn't a TODO item, or related to any previous thread I could find. It's certainly something I can see a use for. When I'm having a bad typing day I get annoyed that I find I've made a mistake after I've typed the password. To me this is a feature that will just make life a little more pleasant for command line junkies like me. Regards, -- Mike Fowler Registered Linux user: 379787 NB: Post attmpt two, yesterday's was never delievered to hackers so apologies if Alastair and Hitoshi have seen this message once already. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: General purpose utility functions used by the JSON data type
On 13/08/10 10:45, Joseph Adams wrote: This patch doesn't include tests . How would I go about writing them? I have made the JSON data type built-in, and I will post that patch shortly (it depends on this one). The built-in JSON data type uses all of these utility functions, and the tests for the JSON data type pass. Joseph, Most existing data types have a regression SQL script in src/test/regress/sql. Using one of them as an example to draw some inspiration from, you should be able to write a 'json.sql' script that exercises the data type and it's supporting functions. Full instructions can be found at http://wiki.postgresql.org/wiki/Regression_test_authoring Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: xml_is_well_formed
On 11/08/10 21:27, Tom Lane wrote: Robert Haasrobertmh...@gmail.com writes: On Mon, Aug 9, 2010 at 10:41 AM, Robert Haasrobertmh...@gmail.com wrote: There's also the fact that it would probably end up parsing the data twice. Given xmloption, I'm inclined to think Tom has it right: provided xml_is_well_formed() that follows xmloption, plus a specific version for each of content and document. Another reasonable option here would be to forget about having xml_is_well_formed() per se and ONLY offer xml_is_well_formed_content() and xml_is_well_formed_document(). We already have xml_is_well_formed(); just dropping it doesn't seem like a helpful choice. As a project management note, this CommitFest is over in 4 days, so unless we have a new version of this patch real soon now we need to defer it to the September 15th CommitFest Yes. Mike, are you expecting to submit a new version before the end of the week? Yes and here it is, apologies for the delay. I have re-implemented xml_is_well_formed such that it is sensitive to the XMLOPTION. The additional _document and _content methods are now present. Tests and documentation adjusted to suit. Regards, -- Mike Fowler Registered Linux user: 379787 *** a/contrib/xml2/pgxml.sql.in --- b/contrib/xml2/pgxml.sql.in *** *** 5,18 SET search_path = public; --SQL for XML parser - CREATE OR REPLACE FUNCTION xml_is_well_formed(text) RETURNS bool - AS 'MODULE_PATHNAME' - LANGUAGE C STRICT IMMUTABLE; - -- deprecated old name for xml_is_well_formed CREATE OR REPLACE FUNCTION xml_valid(text) RETURNS bool ! AS 'MODULE_PATHNAME', 'xml_is_well_formed' ! LANGUAGE C STRICT IMMUTABLE; CREATE OR REPLACE FUNCTION xml_encode_special_chars(text) RETURNS text AS 'MODULE_PATHNAME' --- 5,14 --SQL for XML parser -- deprecated old name for xml_is_well_formed CREATE OR REPLACE FUNCTION xml_valid(text) RETURNS bool ! AS 'xml_is_well_formed' ! LANGUAGE INTERNAL STRICT IMMUTABLE; CREATE OR REPLACE FUNCTION xml_encode_special_chars(text) RETURNS text AS 'MODULE_PATHNAME' *** a/contrib/xml2/uninstall_pgxml.sql --- b/contrib/xml2/uninstall_pgxml.sql *** *** 29,33 DROP FUNCTION xml_encode_special_chars(text); -- deprecated old name for xml_is_well_formed DROP FUNCTION xml_valid(text); - - DROP FUNCTION xml_is_well_formed(text); --- 29,31 *** a/contrib/xml2/xpath.c --- b/contrib/xml2/xpath.c *** *** 27,33 PG_MODULE_MAGIC; /* externally accessible functions */ - Datum xml_is_well_formed(PG_FUNCTION_ARGS); Datum xml_encode_special_chars(PG_FUNCTION_ARGS); Datum xpath_nodeset(PG_FUNCTION_ARGS); Datum xpath_string(PG_FUNCTION_ARGS); --- 27,32 *** *** 71,97 pgxml_parser_init(void) } - /* Returns true if document is well-formed */ - - PG_FUNCTION_INFO_V1(xml_is_well_formed); - - Datum - xml_is_well_formed(PG_FUNCTION_ARGS) - { - text *t = PG_GETARG_TEXT_P(0); /* document buffer */ - int32 docsize = VARSIZE(t) - VARHDRSZ; - xmlDocPtr doctree; - - pgxml_parser_init(); - - doctree = xmlParseMemory((char *) VARDATA(t), docsize); - if (doctree == NULL) - PG_RETURN_BOOL(false); /* i.e. not well-formed */ - xmlFreeDoc(doctree); - PG_RETURN_BOOL(true); - } - - /* Encodes special characters (, , , and \r) as XML entities */ PG_FUNCTION_INFO_V1(xml_encode_special_chars); --- 70,75 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8625,8630 SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF 'townstownTor --- 8625,8736 supports XPath, which is a subset of XQuery. /para /sect3 + +sect3 + titlexml_is_well_formed/title + + indexterm + primaryxml_is_well_formed/primary + secondarywell formed/secondary + /indexterm + + synopsis + functionxml_is_well_formed/function(replaceabletext/replaceable) + /synopsis + + para + The function functionxml_is_well_formed/function evaluates whether + the replaceabletext/replaceable is well formed XML content, returning + a boolean. It is useful for predetermining whether a cast to the XML type + will succeed, and therefore honours the current value of the + literalXMLOPTION/literal session configuration parameter. + /para + para + Example: + screen![CDATA[ + SET xmloption TO DOCUMENT; + SELECT xml_is_well_formed(''); + xml_is_well_formed + + f + (1 row) + + SELECT xml_is_well_formed('abc/'); + xml_is_well_formed + + t + (1 row) + + SET xmloption TO CONTENT; + SELECT xml_is_well_formed('abc'); + xml_is_well_formed + + t + (1 row) + ]]/screen + /para + para + In addition to the structure checks, the function ensures that namespaces are correcty matched. + screen![CDATA[ + SET xmloption TO DOCUMENT; + SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org
Re: [HACKERS] Initial review of xslt with no limits patch
On 09/08/10 04:07, Tom Lane wrote: Robert Haasrobertmh...@gmail.com writes: On Sun, Aug 8, 2010 at 11:36 AM, Mike Fowlerm...@mlfowler.com wrote: 1) XML2 is largely undocumented, giving rise to the problems encountered. Since the module is deprecated anyways, does it make more sense to get xslt handling moved into core and get it fully documented? Yes, I think that would be better. I'm hesitant to consider pulling this into core when there's so little consensus on how it ought to act. It'd be better to have a solid, widely used contrib module *first*, rather than imagine that pulling it into core is somehow a cure for its problems. Perhaps the first step forward is to pull xslt_process out of xml2 and create a standalone xslt contrib module. Then at least it can lose the stigma of being in a deprecated module and perhaps make it more visible to users. 2) Pavel's regression test exposes a bug in libxslt. The stylesheet declares 5 parameters, but uses 12. Simplifying, take the stylesheet: I'm not sure whether there's anything we can do about this. We should file a bug report with the libxslt authors, obviously. Turns out the bug was filed in 2005 (see https://bugzilla.gnome.org/show_bug.cgi?id=307061). They are currently taking a fairly loose interpretation of the XSLT spec. However that was only one aspect of the concern. The other was that no errors were being reported back in psql when the libxslt is generating errors. Is this desirable? Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Initial review of xslt with no limits patch
On 06/08/10 17:50, Pavel Stehule wrote: attached updated patch with regression test Bravely ignoring the quotation/varidic/favourite_scheme_here conversations, I've taken a look at the patch as is. Thanks to Tom's input I can now correctly drive the function. I can also report that code is now behaving in the expected way. I have two other observations, more directed at the community than Pavel: 1) XML2 is largely undocumented, giving rise to the problems encountered. Since the module is deprecated anyways, does it make more sense to get xslt handling moved into core and get it fully documented? 2) Pavel's regression test exposes a bug in libxslt. The stylesheet declares 5 parameters, but uses 12. Simplifying, take the stylesheet: xsl:stylesheet xmlns:xsl=http://www.w3.org/1999/XSL/Transform; version=1.0 xsl:output method=xml omit-xml-declaration=yes indent=yes/ xsl:strip-space elements=*/ xsl:param name=n1/ xsl:template match=* xsl:element name=samples xsl:element name=sample xsl:value-of select=$n1/ /xsl:element xsl:element name=sample xsl:value-of select=$n2/ /xsl:element /xsl:element /xsl:template /xsl:stylesheet and run the command: ~/temp$ xsltproc --stringparam n2 v2 Untitled2.xsl Untitled1.xml samples sample/ samplev2/sample /samples All looks good. However if you run: ~/temp$ xsltproc --stringparam n1 v1 Untitled2.xsl Untitled1.xml runtime error: file Untitled2.xsl line 28 element value-of Variable 'n2' has not been declared. xmlXPathCompiledEval: evaluation failed runtime error: file Untitled2.xsl line 28 element value-of XPath evaluation returned no result. samples samplev1/sample sample/ /samples The xslt_process function ignores these errors and returns cleanly. To summarize, the bug in libxslt is that it allows the processing of unnamed parameters - most other parsers would reject this stylesheet. Secondly, the xslt_process does not return the errors reported when running without passing the unnamed parameter. Personally I would like to see this documented and moved to core so that the whole of xml2 can be dropped. I also think that the errors should be reported, even if libxslt doesn't behave properly in all scenarios. Of course there's that whole other issue around how you pass the parameters in the first place that needs resolving too... Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function
On 06/08/10 20:55, Peter Eisentraut wrote: On fre, 2010-08-06 at 09:04 +0100, Mike Fowler wrote: If the patch is to be committed, does it make sense for me to refine it such that it uses the new xpath internal function you extracted in the xmlexists patch? Yes, you can probably shrink this patch down to about 20 lines. Updated the patch so that it will apply to head and re-worked the function to use the new xpath internal function. Regards, -- Mike Fowler Registered Linux user: 379787 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8693,8698 SELECT xpath('//mydefns:b/text()', 'a xmlns=http://example.com;btest/b/a --- 8693,8731 (1 row) ]]/screen /para + +sect3 + titlexpath_exists/title + + indexterm + primaryxpath_exists/primary + /indexterm + + synopsis + functionxpath_exists/function(replaceablexpath/replaceable, replaceablexml/replaceableoptional, replaceablensarray/replaceable/optional) + /synopsis + + para + The function functionxpath_exists/function is a specialised form + of the functionxpath/function function. Though the functions are + syntactically the same the xpath expressions are evaluated in differing + contexts. Instead of returning the XML values that satisfy the xpath, this + function returns a boolean indicating whether the query was satisfied or not. + /para + + para + Example: + screen![CDATA[ + SELECT xpath_exists('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a', + ARRAY[ARRAY['my', 'http://example.com']]); + + xpath_exists + + t + (1 row) + ]]/screen + /para + /sect3 /sect2 sect2 id=functions-xml-mapping *** a/src/backend/utils/adt/xml.c --- b/src/backend/utils/adt/xml.c *** *** 3541,3543 Datum xmlexists(PG_FUNCTION_ARGS) --- 3541,3567 return 0; #endif } + + /* + * Determines if the node specified by the supplied XPath exists + * in a given XML document, returning a boolean. Differs from + * xmlexists as it supports namespaces and is not defined in SQL/XML. + */ + Datum + xpath_exists(PG_FUNCTION_ARGS) + { + #ifdef USE_LIBXML + text *xpath_expr_text = PG_GETARG_TEXT_P(0); + xmltype*data = PG_GETARG_XML_P(1); + ArrayType *namespaces = PG_GETARG_ARRAYTYPE_P(2); + int res_nitems; + + xpath_internal(xpath_expr_text, data, namespaces, + res_nitems, NULL); + + PG_RETURN_BOOL(res_nitems 0); + #else + NO_XML_SUPPORT(); + return 0; + #endif + } *** a/src/include/catalog/pg_proc.h --- b/src/include/catalog/pg_proc.h *** *** 4390,4395 DESCR(evaluate XPath expression); --- 4390,4400 DATA(insert OID = 2614 ( xmlexists PGNSP PGUID 12 1 0 0 f f f t f i 2 0 16 25 142 _null_ _null_ _null_ _null_ xmlexists _null_ _null_ _null_ )); DESCR(test XML value against XPath expression); + DATA(insert OID = 3037 ( xpath_exists PGNSP PGUID 12 1 0 0 f f f t f i 3 0 16 25 142 1009 _null_ _null_ _null_ _null_ xpath_exists _null_ _null_ _null_ )); + DESCR(evaluate XPath expression in a boolean context, with namespaces support); + DATA(insert OID = 3038 ( xpath_exists PGNSP PGUID 14 1 0 0 f f f t f i 2 0 16 25 142 _null_ _null_ _null_ _null_ select pg_catalog.xpath_exists($1, $2, ''{}''::pg_catalog.text[]) _null_ _null_ _null_ )); + DESCR(evaluate XPath expression in a boolean context); + /* uuid */ DATA(insert OID = 2952 ( uuid_in PGNSP PGUID 12 1 0 0 f f f t f i 1 0 2950 2275 _null_ _null_ _null_ _null_ uuid_in _null_ _null_ _null_ )); DESCR(I/O); *** a/src/include/utils/xml.h --- b/src/include/utils/xml.h *** *** 37,42 extern Datum texttoxml(PG_FUNCTION_ARGS); --- 37,43 extern Datum xmltotext(PG_FUNCTION_ARGS); extern Datum xmlvalidate(PG_FUNCTION_ARGS); extern Datum xpath(PG_FUNCTION_ARGS); + extern Datum xpath_exists(PG_FUNCTION_ARGS); extern Datum xmlexists(PG_FUNCTION_ARGS); extern Datum table_to_xml(PG_FUNCTION_ARGS); *** a/src/test/regress/expected/xml.out --- b/src/test/regress/expected/xml.out *** *** 502,507 SELECT xpath('//b', 'aone btwo/b three betc/b/a'); --- 502,560 {btwo/b,betc/b} (1 row) + -- Test xpath_exists evaluation + SELECT xpath_exists('//town[text() = ''Toronto'']','townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml); + xpath_exists + -- + f + (1 row) + + SELECT xpath_exists('//town[text() = ''Cwmbran'']','townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml); + xpath_exists + -- + t + (1 row) + + INSERT INTO xmltest VALUES (4, 'menubeersnameBudvar/namecostfree/costnameCarling/namecostlots/cost/beers/menu'::xml); + INSERT INTO xmltest VALUES (5, 'menubeersnameMolson/namecostfree/costnameCarling/namecostlots/cost/beers/menu'::xml); + INSERT INTO xmltest VALUES (6, 'myns:menu xmlns:myns=http://myns.com;myns:beersmyns:nameBudvar
Re: [HACKERS] review: xml_is_well_formed
On 06/08/10 21:55, Peter Eisentraut wrote: On fre, 2010-08-06 at 14:43 +0100, Mike Fowler wrote: Or perhaps it could return a string instead of a boolean: content, document, or NULL if it's neither. I like the sound of that. In fact this helps workaround the IS DOCUMENT and IS CONTENT limitations such that you can you can select only content, only documents or both is you use IS NOT NULL. Unless anyone sees a reason that this function needs to remain a boolean function, I'll rework the patch over the weekend. What is the actual use case for this function? Is the above behavior actually useful? The idea is to be able to filter a table that contains XML in TEXT that might not be well formed. Knowing that you're only dealing with well formed XML prevents you blowing up when you attempt the cast. One reason to stick with boolean is backward compatibility. To be honest I'm happiest with returning a boolean, even if there is some confusion over content only being valid. Though changing the return value to DOCUMENT/CONTENT/NULL makes things a touch more explicit, the same results can be achieved by simply running: SELECT data::xml FROM mixed WHERE xml_is_well_formed(data) AND data::xml IS DOCUMENT; Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function
On 06/08/10 05:38, Peter Eisentraut wrote: On tis, 2010-07-27 at 16:33 -0700, David Fetter wrote: * Do we already have it? Not really. There are kludges to accomplish these things, but they're available mostly in the sense that a general-purpose language allows you to write code to do anything a Turing machine can do. I think this has been obsoleted by the xmlexists patch In many ways yes. The only surviving difference is that xpath_exists has support for namespaces and xmlexists does not as the grammar expects namespaces to be handled in the xquery. So if people expect namespace support to be useful that having both functions is useful until I (or someone who works faster than me) get xquery going. If the patch is to be committed, does it make sense for me to refine it such that it uses the new xpath internal function you extracted in the xmlexists patch? Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: xml_is_well_formed
On 03/08/10 16:15, Peter Eisentraut wrote: On lör, 2010-07-31 at 13:40 -0400, Robert Haas wrote: Well-formedness should probably only allow XML documents. I think the point of this function is to determine whether a cast to xml will throw an error. The behavior should probably match exactly whatever test would be applied there. Maybe there should be xml_is_well_formed() xml_is_well_formed_document() xml_is_well_formed_content() I agree that consistency with SQL/XML is desirable, but for someone coming from the outside, the unqualified claim that 'foo' is well-formed XML might sound suspicious. What about making the function sensitive to the XML OPTION, such that: test=# SET xmloption TO DOCUMENT; SET text=# SELECT xml_is_well_formed('foo'); xml_is_well_formed f (1 row) test=# SET xmloption TO CONTENT; SET text=# SELECT xml_is_well_formed('foo'); xml_is_well_formed t (1 row) with the inverse for DOCUMENTS? To me this makes the most sense as it makes the function behave much more like the other xml functions. -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: xml_is_well_formed
On 06/08/10 12:31, Robert Haas wrote: Maybe there should be xml_is_well_formed() xml_is_well_formed_document() xml_is_well_formed_content() I agree that consistency with SQL/XML is desirable, but for someone coming from the outside, the unqualified claim that 'foo' is well-formed XML might sound suspicious. What about making the function sensitive to the XML OPTION, such that: test=# SET xmloption TO DOCUMENT; SET text=# SELECT xml_is_well_formed('foo'); xml_is_well_formed f (1 row) That will make using this function a huge hassle, won't it? Functions that do different things depending on GUC settings are usually troublesome. Having three functions would be more sensible if we need all three behaviors, but I don't see why we do. Or perhaps it could return a string instead of a boolean: content, document, or NULL if it's neither. I like the sound of that. In fact this helps workaround the IS DOCUMENT and IS CONTENT limitations such that you can you can select only content, only documents or both is you use IS NOT NULL. Unless anyone sees a reason that this function needs to remain a boolean function, I'll rework the patch over the weekend. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Initial review of xslt with no limits patch
On 06/08/10 15:08, Andrew Dunstan wrote: On 08/06/2010 02:29 AM, Pavel Stehule wrote: 2010/8/6 David Fetterda...@fetter.org: On Fri, Aug 06, 2010 at 05:57:37AM +0200, Pavel Stehule wrote: 2010/8/6 Andrew Dunstanand...@dunslane.net: On 08/05/2010 06:56 PM, Mike Fowler wrote: SELECT xslt_process('employeenamecim/nameage30/agepay400/pay/employee'::text, $$xsl:stylesheet xmlns:xsl=http://www.w3.org/1999/XSL/Transform; version=1.0 xsl:output method=xml omit-xml-declaration=yes indent=yes/ [snip] /xsl:stylesheet$$::text, 'n1=v1,n2=v2,n3=v3,n4=v4,n5=v5'::text) I haven't been paying attention to this, so sorry if this has been discussed before, but it just caught my eye. Why are we passing these params as a comma-separated list rather than as an array or as a variadic list of params? This looks rather ugly. What if you want to have a param that includes a comma? There is probably problem in pairs - label = value. Can be nice, if we can use a variadic functions for this, but I am afraid, ... using a variadic function isn't too much nice now some xslt_process(xmlsrc, 'n1=v1','n2=v2','n3=v3' This sounds like the perfect case for pulling hstore into core code. :) I afraid so integration of hstore can break and block work on real hash support. I would to have hash tables in core, but with usual features and usual syntax - like Perl or PHP Can we just keep this discussion within reasonable bounds? The issue is not hstore or other hashes, but how to do the param list for xslt sanely given what we have now. A variadic list will be much nicer than what is currently proposed. cheers andrew +1 Variadic seems the most sensible to me anyways. However the more urgent problem is, pending someone spotting an error in my ways, neither the existing code or the patched code appear able to evaluate the parameters. Note than in my example I supplied a default value for the fifth parameter and not even that value is appearing in the outputted XML. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Initial review of xslt with no limits patch
Hi Pavel, On 02/08/10 09:21, Pavel Stehule wrote: Hello 2010/8/2 Mike Fowlerm...@mlfowler.com: Hi Pavel, Currently your patch isn't applying to head, from the looks of things a function signature has changed. Can you update your patch please? yes - see attachment Thanks, the new patch applies cleanly. However I've been attempting to run the parameterised XSLT this evening but to no avail. Reverting your code I've discovered that it does not work in the old version either. Given the complete lack of documentation (not your fault) it's always possible that I'm doing something wrong. Given the query below, you should get the XML that follows, and indeed in oXygen (a standalone XML tool) you do: SELECT xslt_process('employeenamecim/nameage30/agepay400/pay/employee'::text, $$xsl:stylesheet xmlns:xsl=http://www.w3.org/1999/XSL/Transform; version=1.0 xsl:output method=xml omit-xml-declaration=yes indent=yes/ xsl:strip-space elements=*/ xsl:param name=n1/ xsl:param name=n2/ xsl:param name=n3/ xsl:param name=n4/ xsl:param name=n5 select='me'/ xsl:template match=* xsl:element name=samples xsl:element name=sample xsl:value-of select=$n1/ /xsl:element xsl:element name=sample xsl:value-of select=$n2/ /xsl:element xsl:element name=sample xsl:value-of select=$n3/ /xsl:element xsl:element name=sample xsl:value-of select=$n4/ /xsl:element xsl:element name=sample xsl:value-of select=$n5/ /xsl:element /xsl:element /xsl:template /xsl:stylesheet$$::text, 'n1=v1,n2=v2,n3=v3,n4=v4,n5=v5'::text) samples samplev1/sample samplev2/sample samplev3/sample samplev4/sample samplev5/sample /samples Sadly I get the following in both versions: samples sample/ sample/ sample/ sample/ sample/ /samples Unless you can see anything I'm doing wrong I suggest we mark this patch either 'Returned with feedback' or 'Rejected'. Since contrib/xml2 is deprecated, perhaps a better way forward is to pull XSLT handling into core and fix both the apparent inability to handle parameters as well as the limited number of parameters. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: xml_is_well_formed
On 02/08/10 07:46, Pavel Stehule wrote: I have not any suggestions now - so I'll change flag to ready to commit sorry - contrib module should be a fixed patch attached Thanks Pavel, you saved me some time! Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Initial review of xslt with no limits patch
Hi Pavel, Currently your patch isn't applying to head, from the looks of things a function signature has changed. Can you update your patch please? Also, having had a read through the patch itself I note that there are no tests and no changes to documentation. Shouldn't the documentation advertise that the there are no limits on the numbers of parameters? A couple of tests will also help me to test your patch, Below is the results of running patch: patch -p0 ../nolimits.diff patching file ./contrib/xml2/xslt_proc.c Hunk #1 FAILED at 42. Hunk #2 succeeded at 57 (offset -2 lines). Hunk #3 succeeded at 69 (offset -2 lines). Hunk #4 succeeded at 142 (offset -4 lines). Hunk #5 succeeded at 179 (offset -4 lines). Hunk #6 succeeded at 192 with fuzz 1 (offset -4 lines). 1 out of 6 hunks FAILED -- saving rejects to file ./contrib/xml2/xslt_proc.c.rej The rejects were: *** ./contrib/xml2/xslt_proc.c.orig 2010-03-03 20:10:22.0 +0100 --- ./contrib/xml2/xslt_proc.c 2010-05-03 15:07:17.010918303 +0200 *** *** 42,50 extern void pgxml_parser_init(void); /* local defs */ ! static void parse_params(const char **params, text *paramstr); ! #define MAXPARAMS 20 /* must be even, see parse_params() */ #endif /* USE_LIBXSLT */ --- 42,51 extern void pgxml_parser_init(void); /* local defs */ ! const char **parse_params(text *paramstr); ! #define INIT_PARAMS 20/* must be even, see parse_params() */ ! #define EXTEND_PARAMS 20 /* must be even, see parse_params() */ #endif /* USE_LIBXSLT */ Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: xml_is_well_formed
Hi Pavel, Thanks for taking the time to review my patch. Attached is a new version addressing your concerns. On 29/07/10 14:21, Pavel Stehule wrote: I have a few issues: * broken regress test (fedora 13 - xmllint: using libxml version 20707) postgres=# SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org/stuff;;bar/pg:foo'); xml_is_well_formed f (1 row) this xml is broken - but in regress tests is ok [pa...@pavel-stehule ~]$ xmllint xxx xxx:1: parser error : error parsing attribute name pg:foo xmlns:pg=http://postgresql.org/stuff;;bar/pg:foo This is a problem that was observered recently by Thom Brown - the commit fest webapp adds the semicolon after the quote. If you look at the attachment I sent in a email client you'll find there is no semicolon. The patch attached here will also not have the semicolon. * xml_is_well_formed returns true for simple text postgres=# SELECT xml_is_well_formed(''); xml_is_well_formed t (1 row) it is probably wrong result - is it ok?? Yes this is OK, pure text is valid XML content. * I don't understand to this fragment PG_TRY(); + { + size_t count; + xmlChar*version = NULL; + int standalone = -1; +. + res_code = parse_xml_decl(string,count,version, NULL,standalone); + if (res_code != 0) + xml_ereport_by_code(ERROR, ERRCODE_INVALID_XML_CONTENT, + invalid XML content: invalid XML declaration, + res_code); +. + doc = xmlNewDoc(version); + doc-encoding = xmlStrdup((const xmlChar *) UTF-8); + doc-standalone = 1; why? This function can raise exception when declaration is wrong. It is wrong - I think, this function should returns false instead exception. You're quite right, I should be returning false here and not causing an exception. I have corrected this in the attached patch. Regards, -- Mike Fowler Registered Linux user: 379787 *** a/contrib/xml2/xpath.c --- b/contrib/xml2/xpath.c *** *** 27,33 PG_MODULE_MAGIC; /* externally accessible functions */ - Datum xml_is_well_formed(PG_FUNCTION_ARGS); Datum xml_encode_special_chars(PG_FUNCTION_ARGS); Datum xpath_nodeset(PG_FUNCTION_ARGS); Datum xpath_string(PG_FUNCTION_ARGS); --- 27,32 *** *** 70,97 pgxml_parser_init(void) xmlLoadExtDtdDefaultValue = 1; } - - /* Returns true if document is well-formed */ - - PG_FUNCTION_INFO_V1(xml_is_well_formed); - - Datum - xml_is_well_formed(PG_FUNCTION_ARGS) - { - text *t = PG_GETARG_TEXT_P(0); /* document buffer */ - int32 docsize = VARSIZE(t) - VARHDRSZ; - xmlDocPtr doctree; - - pgxml_parser_init(); - - doctree = xmlParseMemory((char *) VARDATA(t), docsize); - if (doctree == NULL) - PG_RETURN_BOOL(false); /* i.e. not well-formed */ - xmlFreeDoc(doctree); - PG_RETURN_BOOL(true); - } - - /* Encodes special characters (, , , and \r) as XML entities */ PG_FUNCTION_INFO_V1(xml_encode_special_chars); --- 69,74 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8554,8562 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; ]]/screen /para /sect3 sect3 ! titleXML Predicates/title indexterm primaryIS DOCUMENT/primary --- 8554,8566 ]]/screen /para /sect3 + /sect2 + + sect2 +titleXML Predicates/title sect3 ! titleIS DOCUMENT/title indexterm primaryIS DOCUMENT/primary *** *** 8574,8579 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; --- 8578,8675 between documents and content fragments. /para /sect3 + +sect3 + titlexml_is_well_formed/title + + indexterm + primaryxml_is_well_formed/primary + secondarywell formed/secondary + /indexterm + + synopsis + functionxml_is_well_formed/function(replaceabletext/replaceable) + /synopsis + + para + The function functionxml_is_well_formed/function evaluates whether + the replaceabletext/replaceable is well formed XML content, returning + a boolean. + /para + para + Example: + screen![CDATA[ + SELECT xml_is_well_formed('foobar/foo'); + xml_is_well_formed + + t + (1 row) + + SELECT xml_is_well_formed('foobar/foo'); + xml_is_well_formed + + f + (1 row) + + SELECT xml_is_well_formed('foobarstuff/foo'); + xml_is_well_formed + + f + (1 row) + ]]/screen + /para + para + In addition to the structure checks, the function ensures that namespaces are correcty matched. + screen![CDATA[ + SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org/stuff
Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar
On 21/07/10 08:33, Mike Fowler wrote: Why is the first argument AexprConst instead of a_expr? The SQL standard says it's a character string literal, but I think we can very well allow arbitrary expressions. Yes, it was AexprConst because of the specification. I also found that using it solved my shift/reduce problems, but I can change it a_expr as see if I can work them out in a different way. [snip] Why c_expr? As with the AexprConst, it's choice was partially influenced by the fact it solved the shift/reduce errors I was getting. I'm guessing than that I should really use a_expr and resolve the shift/reduce problem differently? Attached is the revised version of the patch addressing all the issues raised in the review, except for the use of AexprConst and c_expr. With my limited knowledge of bison I've failed to resolve the shift/reduce errors that are introduced by using a_expr. I'm open to suggestions as my desk is getting annoyed with me beating it in frustration! Thanks again for taking the time to review my work. Regards, -- Mike Fowler Registered Linux user: 379787 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8554,8562 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; ]]/screen /para /sect3 !sect3 titleXML Predicates/title indexterm primaryIS DOCUMENT/primary --- 8554,8570 ]]/screen /para /sect3 +/sect2 !sect2 titleXML Predicates/title + + indexterm + primaryXML Predicates/primary + /indexterm + +sect3 + titleIS DOCUMENT/title indexterm primaryIS DOCUMENT/primary *** *** 8574,8579 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; --- 8582,8619 between documents and content fragments. /para /sect3 + +sect3 + titleXMLEXISTS/title + + indexterm + primaryXMLEXISTS/primary + /indexterm + + synopsis + functionXMLEXISTS/function(replaceablexpath/replaceable optionalPASSING BY REF replaceablexml/replaceable optionalBY REF/optional/optional) + /synopsis + + para + The function functionxmlexists/function returns true if the replaceablexpath/replaceable + returns any nodes and false otherwise. If no replaceablexml/replaceable is passed, the function + will return false as a XPath cannot be evaluated without content. See the + ulink url=http://www.w3.org/TR/xpath/#section-Introduction;W3C recommendation 'XML Path Language'/ulink + for a detailed explanation of why. + /para + + para + Example: + screen![CDATA[ + SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF 'townstownToronto/towntownOttawa/town/towns'); + + xmlexists + + t + (1 row) + ]]/screen + /para +/sect3 /sect2 sect2 id=functions-xml-processing *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *** *** 423,431 static TypeName *TableFuncTypeName(List *columns); %type list opt_check_option %type target xml_attribute_el ! %type list xml_attribute_list xml_attributes %type node xml_root_version opt_xml_root_standalone ! %type ival document_or_content %type boolean xml_whitespace_option %type node common_table_expr --- 423,432 %type list opt_check_option %type target xml_attribute_el ! %type list xml_attribute_list xml_attributes xmlexists_list %type node xml_root_version opt_xml_root_standalone ! %type node xmlexists_query_argument_list ! %type ival document_or_content %type boolean xml_whitespace_option %type node common_table_expr *** *** 511,523 static TypeName *TableFuncTypeName(List *columns); OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER ! PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY PRIOR PRIVILEGES PROCEDURAL PROCEDURE QUOTE ! RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE --- 512,524 OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER ! PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY PRIOR PRIVILEGES PROCEDURAL PROCEDURE QUOTE ! RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REINDEX RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE *** *** 539,545 static TypeName *TableFuncTypeName(List *columns); WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE
Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar
Hi Peter, Thanks for your feedback. On 20/07/10 19:54, Peter Eisentraut wrote: Attached is a patch with the revised XMLEXISTS function, complete with grammar support and regression tests. The implemented grammar is: XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] ) Though the full grammar makes everything after the xpath_expression optional, I've left it has mandatory simply to avoid lots of rework of the function (would need new null checks, memory handling would need reworking). Some thoughts, mostly nitpicks: The snippet of documentation could be clearer. It says if the xml satisifies the xpath. Not sure what that means exactly. An XPath expression, by definition, returns a value. How is that value used to determine the result? I'll rephrase it: The function xmlexists returns true if the xpath returns any nodes and false otherwise. Naming of parser symbols: xmlexists_list isn't actually a list of xmlexists's. That particular rule can probably be done away with anyway and the code be put directly into the XMLEXISTS rule. Why is the first argument AexprConst instead of a_expr? The SQL standard says it's a character string literal, but I think we can very well allow arbitrary expressions. Yes, it was AexprConst because of the specification. I also found that using it solved my shift/reduce problems, but I can change it a_expr as see if I can work them out in a different way. xmlexists_query_argument_list should be optional. OK, I'll change it. The rules xml_default_passing_mechanism and xml_passing_mechanism are pretty useless to have a separate rules. Just mention the tokens where they are used. Again, I'll change that too. Why c_expr? As with the AexprConst, it's choice was partially influenced by the fact it solved the shift/reduce errors I was getting. I'm guessing than that I should really use a_expr and resolve the shift/reduce problem differently? Call the C-level function xmlexists for consistency. Sure. I'll look to get a patch addressing these concerns out in the next day or two, work/family/sleep permitting! :) Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCH] Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function
Thom Brown wrote: Would a test for mismatched or undefined namespaces be necessary? For example: Mismatched namespace: pg:foo xmlns:pg=http://postgresql.org/stuff;bar/my:foo Undefined namespace when used in conjunction with IS DOCUMENT: pg:foo xmlns:my=http://postgresql.org/stuff;bar/pg:foo Thanks for looking at my patch Thom. I hadn't thought of that particular scenario and even though I didn't specifically code for it, the underlying libxml call does correctly reject the mismatched namespace: template1=# SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org/stuff;bar/my:foo'); xml_is_well_formed f (1 row) In the attached patch I've added the example to the SGML documentation and the regression tests. Also, having a look at the following example from the patch: SELECT xml_is_well_formed('local:data xmlns:local=http://127.0.0.1;;local:piece id=1number one/local:piecelocal:piece id=2 //local:data'); xml_is_well_formed t (1 row) Just wondering about that semi-colon after the namespace definition. Thom The semi-colon is not supposed to be there, and I'm not sure where it's come from. With Thunderbird I see the email with my patch as an attachement, downloaded and viewing the file there are no instances of a followed by a ;. However, if I look at the message on the archive at http://archives.postgresql.org/message-id/4c3871c2.8000...@mlfowler.com I can see every URL that ends with a has a ; following it. Should I be escaping the in the patch file in some way or this just an artifact of HTML parsing a patch? Regards, -- Mike Fowler Registered Linux user: 379787 *** a/contrib/xml2/xpath.c --- b/contrib/xml2/xpath.c *** *** 27,33 PG_MODULE_MAGIC; /* externally accessible functions */ - Datum xml_is_well_formed(PG_FUNCTION_ARGS); Datum xml_encode_special_chars(PG_FUNCTION_ARGS); Datum xpath_nodeset(PG_FUNCTION_ARGS); Datum xpath_string(PG_FUNCTION_ARGS); --- 27,32 *** *** 70,97 pgxml_parser_init(void) xmlLoadExtDtdDefaultValue = 1; } - - /* Returns true if document is well-formed */ - - PG_FUNCTION_INFO_V1(xml_is_well_formed); - - Datum - xml_is_well_formed(PG_FUNCTION_ARGS) - { - text *t = PG_GETARG_TEXT_P(0); /* document buffer */ - int32 docsize = VARSIZE(t) - VARHDRSZ; - xmlDocPtr doctree; - - pgxml_parser_init(); - - doctree = xmlParseMemory((char *) VARDATA(t), docsize); - if (doctree == NULL) - PG_RETURN_BOOL(false); /* i.e. not well-formed */ - xmlFreeDoc(doctree); - PG_RETURN_BOOL(true); - } - - /* Encodes special characters (, , , and \r) as XML entities */ PG_FUNCTION_INFO_V1(xml_encode_special_chars); --- 69,74 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8554,8562 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; ]]/screen /para /sect3 sect3 ! titleXML Predicates/title indexterm primaryIS DOCUMENT/primary --- 8554,8566 ]]/screen /para /sect3 + /sect2 + + sect2 +titleXML Predicates/title sect3 ! titleIS DOCUMENT/title indexterm primaryIS DOCUMENT/primary *** *** 8574,8579 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; --- 8578,8675 between documents and content fragments. /para /sect3 + +sect3 + titlexml_is_well_formed/title + + indexterm + primaryxml_is_well_formed/primary + secondarywell formed/secondary + /indexterm + + synopsis + functionxml_is_well_formed/function(replaceabletext/replaceable) + /synopsis + + para + The function functionxml_is_well_formed/function evaluates whether + the replaceabletext/replaceable is well formed XML content, returning + a boolean. + /para + para + Example: + screen![CDATA[ + SELECT xml_is_well_formed('foobar/foo'); + xml_is_well_formed + + t + (1 row) + + SELECT xml_is_well_formed('foobar/foo'); + xml_is_well_formed + + f + (1 row) + + SELECT xml_is_well_formed('foobarstuff/foo'); + xml_is_well_formed + + f + (1 row) + ]]/screen + /para + para + In addition to the structure checks, the function ensures that namespaces are correcty matched. + screen![CDATA[ + SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org/stuff;bar/my:foo'); + xml_is_well_formed + + f + (1 row) + + SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org/stuff;bar/pg:foo'); + xml_is_well_formed + + t + (1 row) + ]]/screen + /para + para + This function can be combined with the IS DOCUMENT predicate to prevent + invalid XML content errors from occuring in queries. For example, given a + table that may have rows with invalid XML mixed in with rows of valid
Re: [PATCH] Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function
Robert Haas wrote: On Fri, Jul 9, 2010 at 4:06 PM, Peter Eisentraut pete...@gmx.net wrote: On ons, 2010-07-07 at 16:37 +0100, Mike Fowler wrote: Here's the patch to add the 'xml_is_well_formed' function. I suppose we should remove the function from contrib/xml2 at the same time. Yep Revised patch deleting the contrib/xml2 version of the function attached. Regards, -- Mike Fowler Registered Linux user: 379787 *** a/contrib/xml2/xpath.c --- b/contrib/xml2/xpath.c *** *** 27,33 PG_MODULE_MAGIC; /* externally accessible functions */ - Datum xml_is_well_formed(PG_FUNCTION_ARGS); Datum xml_encode_special_chars(PG_FUNCTION_ARGS); Datum xpath_nodeset(PG_FUNCTION_ARGS); Datum xpath_string(PG_FUNCTION_ARGS); --- 27,32 *** *** 70,97 pgxml_parser_init(void) xmlLoadExtDtdDefaultValue = 1; } - - /* Returns true if document is well-formed */ - - PG_FUNCTION_INFO_V1(xml_is_well_formed); - - Datum - xml_is_well_formed(PG_FUNCTION_ARGS) - { - text *t = PG_GETARG_TEXT_P(0); /* document buffer */ - int32 docsize = VARSIZE(t) - VARHDRSZ; - xmlDocPtr doctree; - - pgxml_parser_init(); - - doctree = xmlParseMemory((char *) VARDATA(t), docsize); - if (doctree == NULL) - PG_RETURN_BOOL(false); /* i.e. not well-formed */ - xmlFreeDoc(doctree); - PG_RETURN_BOOL(true); - } - - /* Encodes special characters (, , , and \r) as XML entities */ PG_FUNCTION_INFO_V1(xml_encode_special_chars); --- 69,74 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8554,8562 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; ]]/screen /para /sect3 sect3 ! titleXML Predicates/title indexterm primaryIS DOCUMENT/primary --- 8554,8566 ]]/screen /para /sect3 + /sect2 + + sect2 +titleXML Predicates/title sect3 ! titleIS DOCUMENT/title indexterm primaryIS DOCUMENT/primary *** *** 8574,8579 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; --- 8578,8653 between documents and content fragments. /para /sect3 + +sect3 + titlexml_is_well_formed/title + + indexterm + primaryxml_is_well_formed/primary + secondarywell formed/secondary + /indexterm + + synopsis + functionxml_is_well_formed/function(replaceabletext/replaceable) + /synopsis + + para + The function functionxml_is_well_formed/function evaluates whether + the replaceabletext/replaceable is well formed XML content, returning + a boolean. + /para + para + Example: + screen![CDATA[ + SELECT xml_is_well_formed('foobar/foo'); + xml_is_well_formed + + t + (1 row) + + SELECT xml_is_well_formed('foobar/foo'); + xml_is_well_formed + + f + (1 row) + ]]/screen + /para + para + This function can be combined with the IS DOCUMENT predicate to prevent + invalid XML content errors from occuring in queries. For example, given a + table that may have rows with invalid XML mixed in with rows of valid + XML, functionxml_is_well_formed/function can be used to filter out all + the invalid rows. + /para + para + Example: + screen![CDATA[ + SELECT * FROM mixed; + data + -- + foobar/foo + foobar/foo + foobar/foobarfoo/bar + foobar/foobarfoo/bar + (4 rows) + + SELECT COUNT(data) FROM mixed WHERE data::xml IS DOCUMENT; + ERROR: invalid XML content + DETAIL: Entity: line 1: parser error : expected '' + foobar/foo + ^ + Entity: line 1: parser error : chunk is not well balanced + foobar/foo + ^ + + SELECT COUNT(data) FROM mixed WHERE xml_is_well_formed(data) AND data::xml IS DOCUMENT; + count + --- + 1 + (1 row) + ]]/screen + /para +/sect3 /sect2 sect2 id=functions-xml-processing *** a/src/backend/utils/adt/xml.c --- b/src/backend/utils/adt/xml.c *** *** 3293,3298 xml_xmlnodetoxmltype(xmlNodePtr cur) --- 3293,3365 } #endif + Datum + xml_is_well_formed(PG_FUNCTION_ARGS) + { + #ifdef USE_LIBXML + text*data = PG_GETARG_TEXT_P(0); + boolresult; + int res_code; + int32len; + const xmlChar *string; + xmlParserCtxtPtr ctxt; + xmlDocPtr doc = NULL; + + len = VARSIZE(data) - VARHDRSZ; + string = xml_text2xmlChar(data); + + /* Start up libxml and its parser (no-ops if already done) */ + pg_xml_init(); + xmlInitParser(); + + ctxt = xmlNewParserCtxt(); + if (ctxt == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate parser context); + + PG_TRY(); + { + size_t count; + xmlChar*version = NULL; + int standalone = -1; + + res_code = parse_xml_decl(string, count, version, NULL, standalone); + if (res_code != 0) + xml_ereport_by_code(ERROR
[PATCH] Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function
Peter Eisentraut wrote: On lör, 2010-07-03 at 09:26 +0100, Mike Fowler wrote: What I will do instead is implement the xml_is_well_formed function and get a patch out in the next day or two. That sounds very useful. Here's the patch to add the 'xml_is_well_formed' function. Paraphrasing the SGML the syntax is: |xml_is_well_formed|(/text/) The function |xml_is_well_formed| evaluates whether the /text/ is well formed XML content, returning a boolean. I've done some tests (included in the patch) with tables containing a mixture of well formed documents and content and the function is happily returning the expected result. Combining with IS (NOT) DOCUMENT is working nicely for pulling out content or documents from a table of text. Unless I missed something in the original correspondence, I think this patch will solve the issue. Regards, -- Mike Fowler Registered Linux user: 379787 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8554,8562 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; ]]/screen /para /sect3 sect3 ! titleXML Predicates/title indexterm primaryIS DOCUMENT/primary --- 8554,8566 ]]/screen /para /sect3 + /sect2 + + sect2 +titleXML Predicates/title sect3 ! titleIS DOCUMENT/title indexterm primaryIS DOCUMENT/primary *** *** 8574,8579 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; --- 8578,8653 between documents and content fragments. /para /sect3 + +sect3 + titlexml_is_well_formed/title + + indexterm + primaryxml_is_well_formed/primary + secondarywell formed/secondary + /indexterm + + synopsis + functionxml_is_well_formed/function(replaceabletext/replaceable) + /synopsis + + para + The function functionxml_is_well_formed/function evaluates whether + the replaceabletext/replaceable is well formed XML content, returning + a boolean. + /para + para + Example: + screen![CDATA[ + SELECT xml_is_well_formed('foobar/foo'); + xml_is_well_formed + + t + (1 row) + + SELECT xml_is_well_formed('foobar/foo'); + xml_is_well_formed + + f + (1 row) + ]]/screen + /para + para + This function can be combined with the IS DOCUMENT predicate to prevent + invalid XML content errors from occuring in queries. For example, given a + table that may have rows with invalid XML mixed in with rows of valid + XML, functionxml_is_well_formed/function can be used to filter out all + the invalid rows. + /para + para + Example: + screen![CDATA[ + SELECT * FROM mixed; + data + -- + foobar/foo + foobar/foo + foobar/foobarfoo/bar + foobar/foobarfoo/bar + (4 rows) + + SELECT COUNT(data) FROM mixed WHERE data::xml IS DOCUMENT; + ERROR: invalid XML content + DETAIL: Entity: line 1: parser error : expected '' + foobar/foo + ^ + Entity: line 1: parser error : chunk is not well balanced + foobar/foo + ^ + + SELECT COUNT(data) FROM mixed WHERE xml_is_well_formed(data) AND data::xml IS DOCUMENT; + count + --- + 1 + (1 row) + ]]/screen + /para +/sect3 /sect2 sect2 id=functions-xml-processing *** a/src/backend/utils/adt/xml.c --- b/src/backend/utils/adt/xml.c *** *** 3293,3298 xml_xmlnodetoxmltype(xmlNodePtr cur) --- 3293,3365 } #endif + Datum + xml_is_well_formed(PG_FUNCTION_ARGS) + { + #ifdef USE_LIBXML + text*data = PG_GETARG_TEXT_P(0); + boolresult; + int res_code; + int32len; + const xmlChar *string; + xmlParserCtxtPtr ctxt; + xmlDocPtr doc = NULL; + + len = VARSIZE(data) - VARHDRSZ; + string = xml_text2xmlChar(data); + + /* Start up libxml and its parser (no-ops if already done) */ + pg_xml_init(); + xmlInitParser(); + + ctxt = xmlNewParserCtxt(); + if (ctxt == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate parser context); + + PG_TRY(); + { + size_t count; + xmlChar*version = NULL; + int standalone = -1; + + res_code = parse_xml_decl(string, count, version, NULL, standalone); + if (res_code != 0) + xml_ereport_by_code(ERROR, ERRCODE_INVALID_XML_CONTENT, + invalid XML content: invalid XML declaration, + res_code); + + doc = xmlNewDoc(version); + doc-encoding = xmlStrdup((const xmlChar *) UTF-8); + doc-standalone = 1; + + res_code = xmlParseBalancedChunkMemory(doc, NULL, NULL, 0, string + count, NULL); + + result = !res_code; + } + PG_CATCH(); + { + if (doc) + xmlFreeDoc(doc); + if (ctxt) + xmlFreeParserCtxt(ctxt); + + PG_RE_THROW(); + } + PG_END_TRY(); + + if (doc) + xmlFreeDoc(doc); + if (ctxt) + xmlFreeParserCtxt(ctxt); + + return result; + #else + NO_XML_SUPPORT(); + return 0
Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function
Quoting Peter Eisentraut pete...@gmx.net: On fre, 2010-07-02 at 14:07 +0100, Mike Fowler wrote: So if IS CONTENT were to be implemented, to determine that you have something that is malformed But that's not what IS CONTENT does. Content still needs to be well-formed. What I was hoping to achieve was to determine that something wasn't a document and wasn't content, however as you pointed out on the bugs thread the value must be XML. My mistake was not checking that I had followed the definitions all the way back to the root. What I will do instead is implement the xml_is_well_formed function and get a patch out in the next day or two. Thank you Robert and Peter for tolerating my stumbles through the standard. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function
Quoting Robert Haas robertmh...@gmail.com: I think the point if IS DOCUMENT is to distinguish a document: foosome stuffbar/baz//foo from a document fragment: bar/baz/ A document is allowed only one toplevel tag. It'd be nice, I think, to have a function that tells you whether something is legal XML without throwing an error if it isn't, but I suspect that should be a separate function, rather than trying to jam it into IS DOCUMENT. http://developer.postgresql.org/pgdocs/postgres/functions-xml.html#AEN15187 I've submitted a patch to the bug report I filed yesterday that implements this. The way I read the standard (and I'm only reading a draft and I'm no expert) I don't see that it mandates that IS DOCUMENT returns false when IS CONTENT would return true. So if IS CONTENT were to be implemented, to determine that you have something that is malformed you could say: val IS NOT DOCUMENT AND val IS NOT CONTENT I think having the direct predicate support would be useful for columns of text where you know that some, though possibly not all, text values are valid XML. -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function
Quoting Mike Fowler m...@mlfowler.com: Should the IS DOCUMENT predicate support this? At the moment you get the following: template1=# SELECT 'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns' IS DOCUMENT; ?column? -- t (1 row) template1=# SELECT 'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns' IS DOCUMENT; ERROR: invalid XML content LINE 1: SELECT 'townstownBidford-on-Avon/towntownCwmbran/to... ^ DETAIL: Entity: line 1: parser error : expected '' ownstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns ^ Entity: line 1: parser error : chunk is not well balanced ownstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns ^ I would've hoped the second would've returned 'f' rather than failing. I've had a glance at the XML/SQL standard and I don't see anything in the detail of the predicate (8.2) that would specifically prohibit us from changing this behavior, unless the common rule 'Parsing a string as an XML value' (10.16) must always be in force. I'm no standard expert, but IMHO this would be an acceptable change to improve usability. What do others think? Right, I've answered my own question whilst sitting in the open source coding session at CHAR(10). Yes, IS DOCUMENT should return false for a non-well formed document, and indeed is coded to do such. However, the conversion to the xml type which happens before the underlying xml_is_document function is even called fails and exceptions out. I'll work on a patch to resolve this behavior such that IS DOCUMENT will give you the missing 'xml_is_well_formed' function. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCH] Re: [HACKERS] Adding xpath_exists function
Mike Fowler wrote: Bruce Momjian wrote: I have added this to the next commit-fest: https://commitfest.postgresql.org/action/commitfest_view?id=6 Thanks Bruce. Attached is a revised patch which changes the code slightly such that it uses an older version of the libxml library. I've added comments to the code so that we remember why we didn't use the latest function. After seeing some other posts in the last couple of days, I realised I hadn't documented the function in the SGML. I have now done so, and added a couple of tests with XML literals. Please find the patch attached. Now time to go correct the xmlexists patch too... -- Mike Fowler Registered Linux user: 379787 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8626,8631 SELECT xpath('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a', --- 8626,8664 (1 row) ]]/screen /para + +sect3 + titlexpath_exists/title + + indexterm + primaryxpath_exists/primary + /indexterm + + synopsis + functionxpath_exists/function(replaceablexpath/replaceable, replaceablexml/replaceableoptional, replaceablensarray/replaceable/optional) + /synopsis + + para + The function functionxpath_exists/function is a specialised form + of the functionxpath/function function. Though the functions are + syntactically the same the xpath expressions are evaluated in differing + contexts. Instead of returning the XML values that satisfy the xpath, this + function returns a boolean indicating whether the query was satisfied or not. + /para + + para + Example: + screen![CDATA[ + SELECT xpath_exists('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a', + ARRAY[ARRAY['my', 'http://example.com']]); + + xpath_exists + + t + (1 row) + ]]/screen + /para +/sect3 /sect2 sect2 id=functions-xml-mapping *** a/src/backend/utils/adt/xml.c --- b/src/backend/utils/adt/xml.c *** *** 3495,3497 xpath(PG_FUNCTION_ARGS) --- 3495,3681 return 0; #endif } + + /* + * Determines if the node specified by the supplied XPath exists + * in a given XML document, returning a boolean. + * + * It is up to the user to ensure that the XML passed is in fact + * an XML document - XPath doesn't work easily on fragments without + * a context node being known. + */ + Datum + xpath_exists(PG_FUNCTION_ARGS) + { + #ifdef USE_LIBXML + text *xpath_expr_text = PG_GETARG_TEXT_P(0); + xmltype*data = PG_GETARG_XML_P(1); + ArrayType *namespaces = PG_GETARG_ARRAYTYPE_P(2); + ArrayBuildState *astate = NULL; + xmlParserCtxtPtr ctxt = NULL; + xmlDocPtr doc = NULL; + xmlXPathContextPtr xpathctx = NULL; + xmlXPathCompExprPtr xpathcomp = NULL; + xmlXPathObjectPtr xpathobj = NULL; + char *datastr; + int32 len; + int32 xpath_len; + xmlChar*string; + xmlChar*xpath_expr; + int i; + int res_nitems; + int ndim; + Datum *ns_names_uris; + bool *ns_names_uris_nulls; + int ns_count; + int result; + + /* + * Namespace mappings are passed as text[]. If an empty array is passed + * (ndim = 0, 0-dimensional), then there are no namespace mappings. + * Else, a 2-dimensional array with length of the second axis being equal + * to 2 should be passed, i.e., every subarray contains 2 elements, the + * first element defining the name, the second one the URI. Example: + * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2', + * 'http://example2.com']]. + */ + ndim = ARR_NDIM(namespaces); + if (ndim != 0) + { + int *dims; + + dims = ARR_DIMS(namespaces); + + if (ndim != 2 || dims[1] != 2) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(invalid array for XML namespace mapping), + errdetail(The array must be two-dimensional with length of the second axis equal to 2.))); + + Assert(ARR_ELEMTYPE(namespaces) == TEXTOID); + + deconstruct_array(namespaces, TEXTOID, -1, false, 'i', + ns_names_uris, ns_names_uris_nulls, + ns_count); + + Assert((ns_count % 2) == 0); /* checked above */ + ns_count /= 2; /* count pairs only */ + } + else + { + ns_names_uris = NULL; + ns_names_uris_nulls = NULL; + ns_count = 0; + } + + datastr = VARDATA(data); + len = VARSIZE(data) - VARHDRSZ; + xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ; + if (xpath_len == 0) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(empty XPath expression))); + + string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar)); + memcpy(string, datastr, len); + string[len] = '\0'; + + xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar)); + memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len); + xpath_expr[xpath_len] = '\0'; + + pg_xml_init(); + xmlInitParser(); + + PG_TRY(); + { + /* + * redundant XML parsing (two
Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar
Mike Fowler wrote: Thanks again for your help Robert, turns out the fault was in the pg_proc entry (the 3 up there should've been a two!). Once I took the grammar out it was quickly obvious where I'd gone wrong. Attached is a patch with the revised XMLEXISTS function, complete with grammar support and regression tests. The implemented grammar is: XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] ) Though the full grammar makes everything after the xpath_expression optional, I've left it has mandatory simply to avoid lots of rework of the function (would need new null checks, memory handling would need reworking). As with the xpath_exists patch I've now added the SGML documentation detailing this function and extended the regression test a little to test XML literals. Regards, -- Mike Fowler Registered Linux user: 379787 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8554,8562 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; ]]/screen /para /sect3 !sect3 titleXML Predicates/title indexterm primaryIS DOCUMENT/primary --- 8554,8570 ]]/screen /para /sect3 +/sect2 !sect2 titleXML Predicates/title + + indexterm + primaryXML Predicates/primary + /indexterm + +sect3 + titleIS DOCUMENT/title indexterm primaryIS DOCUMENT/primary *** *** 8574,8579 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; --- 8582,8616 between documents and content fragments. /para /sect3 + +sect3 + titleXMLEXISTS/title + + indexterm + primaryXMLEXISTS/primary + /indexterm + + synopsis + functionXMLEXISTS/function(replaceablexpath/replaceable PASSING BY REF replaceablexml/replaceable optionalBY REF/optional) + /synopsis + + para + The function functionxmlexists/function returns true if the replaceablexml/replaceable + satisfies the replaceablexpath/replaceable and false otherwise. + /para + + para + Example: + screen![CDATA[ + SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF 'townstownToronto/towntownOttawa/town/towns'); + + xmlexists + + t + (1 row) + ]]/screen + /para +/sect3 /sect2 sect2 id=functions-xml-processing *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *** *** 423,431 static TypeName *TableFuncTypeName(List *columns); %type list opt_check_option %type target xml_attribute_el ! %type list xml_attribute_list xml_attributes %type node xml_root_version opt_xml_root_standalone ! %type ival document_or_content %type boolean xml_whitespace_option %type node common_table_expr --- 423,432 %type list opt_check_option %type target xml_attribute_el ! %type list xml_attribute_list xml_attributes xmlexists_list %type node xml_root_version opt_xml_root_standalone ! %type node xmlexists_query_argument_list xml_default_passing_mechanism xml_passing_mechanism ! %type ival document_or_content %type boolean xml_whitespace_option %type node common_table_expr *** *** 511,523 static TypeName *TableFuncTypeName(List *columns); OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER ! PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY PRIOR PRIVILEGES PROCEDURAL PROCEDURE QUOTE ! RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE --- 512,524 OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER ! PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY PRIOR PRIVILEGES PROCEDURAL PROCEDURE QUOTE ! RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REINDEX RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE *** *** 539,545 static TypeName *TableFuncTypeName(List *columns); WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE ! XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLFOREST XMLPARSE XMLPI XMLROOT XMLSERIALIZE YEAR_P YES_P --- 540,546 WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE ! XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLPARSE XMLPI XMLROOT XMLSERIALIZE YEAR_P YES_P *** *** 9806,9811 func_expr: func_name '(' ')' over_clause --- 9807,9828 { $$ = makeXmlExpr(IS_XMLELEMENT, $4, $6, $8
Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function
Robert Haas wrote: On Mon, Jun 28, 2010 at 11:42 AM, Mike Rylander mrylan...@gmail.com wrote: You could do something like this (untested): CREATE OR REPLACE FUNCTION my_xml_is_valid ( x TEXT ) RETURNS BOOL AS $$ BEGIN PERFORM XMLPARSE( DOCUMENT x::XML ); RETURN TRUE; EXCEPTION WHEN OTHERS THEN RETURN FALSE; END; $$ LANGUAGE PLPGSQL; This might perform significantly worse, though: exception handling ain't cheap. It's not a bad workaround, but I think the OP has a point. Should the IS DOCUMENT predicate support this? At the moment you get the following: template1=# SELECT 'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns' IS DOCUMENT; ?column? -- t (1 row) template1=# SELECT 'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns' IS DOCUMENT; ERROR: invalid XML content LINE 1: SELECT 'townstownBidford-on-Avon/towntownCwmbran/to... ^ DETAIL: Entity: line 1: parser error : expected '' ownstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns ^ Entity: line 1: parser error : chunk is not well balanced ownstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns ^ I would've hoped the second would've returned 'f' rather than failing. I've had a glance at the XML/SQL standard and I don't see anything in the detail of the predicate (8.2) that would specifically prohibit us from changing this behavior, unless the common rule 'Parsing a string as an XML value' (10.16) must always be in force. I'm no standard expert, but IMHO this would be an acceptable change to improve usability. What do others think? Regards, -- Mike Fowler Registered Linux user: 379787 I could be a genius if I just put my mind to it, and I, I could do anything, if only I could get 'round to it -PULP 'Glory Days' -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar
and finally in pg_proc.h I have: DATA(insert OID = 3037 ( xmlexists PGNSP PGUID 12 1 0 0 f f f t f i 3 0 16 25 142 _null_ _null_ _null_ _null_ xml_exists _null_ _null_ _null_ )); DESCR(evaluate XPath expression in a boolean context); It looks like the pg_proc entry is creating an SQL function called xmlexists referencing a C function called xml_exists, and the gram.y changes want there to be an SQL function called xml_exists. I think you should rip out all the catalog and parser changes for starters, and just try to get it working as a regular old function. Once you have that working, you can add the syntax support back in. I'd suggest making the C and SQL function names the same as each other, but different from the keyword you're planning to use (xmlexists). Thanks again for your help Robert, turns out the fault was in the pg_proc entry (the 3 up there should've been a two!). Once I took the grammar out it was quickly obvious where I'd gone wrong. Attached is a patch with the revised XMLEXISTS function, complete with grammar support and regression tests. The implemented grammar is: XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] ) Though the full grammar makes everything after the xpath_expression optional, I've left it has mandatory simply to avoid lots of rework of the function (would need new null checks, memory handling would need reworking). -- Mike Fowler Registered Linux user: 379787 *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *** *** 423,431 static TypeName *TableFuncTypeName(List *columns); %type list opt_check_option %type target xml_attribute_el ! %type list xml_attribute_list xml_attributes %type node xml_root_version opt_xml_root_standalone ! %type ival document_or_content %type boolean xml_whitespace_option %type node common_table_expr --- 423,432 %type list opt_check_option %type target xml_attribute_el ! %type list xml_attribute_list xml_attributes xmlexists_list %type node xml_root_version opt_xml_root_standalone ! %type node xmlexists_query_argument_list xml_default_passing_mechanism xml_passing_mechanism ! %type ival document_or_content %type boolean xml_whitespace_option %type node common_table_expr *** *** 511,523 static TypeName *TableFuncTypeName(List *columns); OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER ! PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY PRIOR PRIVILEGES PROCEDURAL PROCEDURE QUOTE ! RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE --- 512,524 OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER ! PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY PRIOR PRIVILEGES PROCEDURAL PROCEDURE QUOTE ! RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REINDEX RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE *** *** 539,545 static TypeName *TableFuncTypeName(List *columns); WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE ! XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLFOREST XMLPARSE XMLPI XMLROOT XMLSERIALIZE YEAR_P YES_P --- 540,546 WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE ! XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLPARSE XMLPI XMLROOT XMLSERIALIZE YEAR_P YES_P *** *** 9806,9811 func_expr: func_name '(' ')' over_clause --- 9807,9828 { $$ = makeXmlExpr(IS_XMLELEMENT, $4, $6, $8, @1); } + | XMLEXISTS '(' xmlexists_list ')' + { + /* xmlexists(A [PASSING BY REF B [BY REF]]) is converted to + * xmlexists(A, B)*/ + + FuncCall *n = makeNode(FuncCall); + n-funcname = SystemFuncName(xmlexists); + n-args = $3; + n-agg_order = NIL; + n-agg_star = FALSE; + n-agg_distinct = FALSE; + n-func_variadic = FALSE; + n-over = NULL; + n-location = @1; + $$ = (Node *)n; + } | XMLFOREST '(' xml_attribute_list ')' { $$ = makeXmlExpr(IS_XMLFOREST, NULL, $3, NIL, @1); *** *** 9896,9901 xml_whitespace_option: PRESERVE WHITESPACE_P { $$ = TRUE; } --- 9913,9946 | /*EMPTY*/{ $$ = FALSE; } ; + xmlexists_list: + AexprConst xmlexists_query_argument_list + { + $$ = list_make2(makeTypeCast($1,SystemTypeName(text), -1), $2); + } + ; + + xmlexists_query_argument_list
[PATCH] Re: [HACKERS] Adding xpath_exists function
Bruce Momjian wrote: I have added this to the next commit-fest: https://commitfest.postgresql.org/action/commitfest_view?id=6 Thanks Bruce. Attached is a revised patch which changes the code slightly such that it uses an older version of the libxml library. I've added comments to the code so that we remember why we didn't use the latest function. Regards, -- Mike Fowler Registered Linux user: 379787 *** a/src/backend/utils/adt/xml.c --- b/src/backend/utils/adt/xml.c *** *** 3495,3497 xpath(PG_FUNCTION_ARGS) --- 3495,3681 return 0; #endif } + + /* + * Determines if the node specified by the supplied XPath exists + * in a given XML document, returning a boolean. + * + * It is up to the user to ensure that the XML passed is in fact + * an XML document - XPath doesn't work easily on fragments without + * a context node being known. + */ + Datum + xpath_exists(PG_FUNCTION_ARGS) + { + #ifdef USE_LIBXML + text *xpath_expr_text = PG_GETARG_TEXT_P(0); + xmltype*data = PG_GETARG_XML_P(1); + ArrayType *namespaces = PG_GETARG_ARRAYTYPE_P(2); + ArrayBuildState *astate = NULL; + xmlParserCtxtPtr ctxt = NULL; + xmlDocPtr doc = NULL; + xmlXPathContextPtr xpathctx = NULL; + xmlXPathCompExprPtr xpathcomp = NULL; + xmlXPathObjectPtr xpathobj = NULL; + char *datastr; + int32 len; + int32 xpath_len; + xmlChar*string; + xmlChar*xpath_expr; + int i; + int res_nitems; + int ndim; + Datum *ns_names_uris; + bool *ns_names_uris_nulls; + int ns_count; + int result; + + /* + * Namespace mappings are passed as text[]. If an empty array is passed + * (ndim = 0, 0-dimensional), then there are no namespace mappings. + * Else, a 2-dimensional array with length of the second axis being equal + * to 2 should be passed, i.e., every subarray contains 2 elements, the + * first element defining the name, the second one the URI. Example: + * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2', + * 'http://example2.com']]. + */ + ndim = ARR_NDIM(namespaces); + if (ndim != 0) + { + int *dims; + + dims = ARR_DIMS(namespaces); + + if (ndim != 2 || dims[1] != 2) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(invalid array for XML namespace mapping), + errdetail(The array must be two-dimensional with length of the second axis equal to 2.))); + + Assert(ARR_ELEMTYPE(namespaces) == TEXTOID); + + deconstruct_array(namespaces, TEXTOID, -1, false, 'i', + ns_names_uris, ns_names_uris_nulls, + ns_count); + + Assert((ns_count % 2) == 0); /* checked above */ + ns_count /= 2; /* count pairs only */ + } + else + { + ns_names_uris = NULL; + ns_names_uris_nulls = NULL; + ns_count = 0; + } + + datastr = VARDATA(data); + len = VARSIZE(data) - VARHDRSZ; + xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ; + if (xpath_len == 0) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(empty XPath expression))); + + string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar)); + memcpy(string, datastr, len); + string[len] = '\0'; + + xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar)); + memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len); + xpath_expr[xpath_len] = '\0'; + + pg_xml_init(); + xmlInitParser(); + + PG_TRY(); + { + /* + * redundant XML parsing (two parsings for the same value during one + * command execution are possible) + */ + ctxt = xmlNewParserCtxt(); + if (ctxt == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate parser context); + doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0); + if (doc == NULL) + xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT, + could not parse XML document); + xpathctx = xmlXPathNewContext(doc); + if (xpathctx == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate XPath context); + xpathctx-node = xmlDocGetRootElement(doc); + if (xpathctx-node == NULL) + xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR, + could not find root XML element); + + /* register namespaces, if any */ + if (ns_count 0) + { + for (i = 0; i ns_count; i++) + { + char *ns_name; + char *ns_uri; + + if (ns_names_uris_nulls[i * 2] || + ns_names_uris_nulls[i * 2 + 1]) + ereport(ERROR, + (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), + errmsg(neither namespace name nor URI may be null))); + ns_name = TextDatumGetCString(ns_names_uris[i * 2]); + ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]); + if (xmlXPathRegisterNs(xpathctx, + (xmlChar *) ns_name, + (xmlChar *) ns_uri) != 0) + ereport(ERROR, /* is this an internal error??? */ + (errmsg(could not register XML namespace with name \%s\ and URI \%s\, + ns_name, ns_uri))); + } + } + + xpathcomp
Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar
Robert Haas wrote: On Sun, Jun 27, 2010 at 12:04 PM, Mike Fowler m...@mlfowler.com wrote: Thanks again for your help Robert, turns out the fault was in the pg_proc entry (the 3 up there should've been a two!). Once I took the grammar out it was quickly obvious where I'd gone wrong. Glad it was a helpful suggestion. Attached is a patch with the revised XMLEXISTS function, complete with grammar support and regression tests. The implemented grammar is: XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] ) Though the full grammar makes everything after the xpath_expression optional, I've left it has mandatory simply to avoid lots of rework of the function (would need new null checks, memory handling would need reworking). So if you don't specify the xml_value, what does the xpath_expression get applied to? From what I can gather the xpath_expression would be evalutated against an empty document thereby returning false for every xpath_expression except for 'true()'. Apache Derby has made the xml_value mandatory as well (though I'll stress my conclusion wasn't based on this fact). If you think it would better to adhere more closely to the standard I can certainly look to do so. From a cursory glance at libxml's API I think it should be straight forward to query against an empty document such that I wouldn't need ot code for the exceptional case (or cases if I've missed others). Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adding XMLEXISTS to the grammar
Robert Haas wrote: I usually troubleshoot things like this by setting a breakpoint in elog_start or elog_finish. Then you can see where it's blowing up. Off the top of my head, I would guess you've added a node type whose structure definition doesn't begin with NodeTag, or else you've got a memory clobber. Thanks Robert, I've managed to resolve this make making a type cast inside gram.y. However, it now seems that the function itself can not be found. I've made an entry in pg_proc.h, but when running psql I'm getting the following: xmltest=# SELECT COUNT(id) FROM xmltest WHERE xmlexists('/menu/beers' PASSING BY REF data); ERROR: function pg_catalog.xml_exists(text, xml) does not exist LINE 1: SELECT COUNT(id) FROM xmltest WHERE xmlexists('/menu/beers' ... ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. In gram.y I've got: FuncCall *n = makeNode(FuncCall); n-funcname = SystemFuncName(xml_exists); (also tried SystemFuncName(xmlexists);) In xml.h: extern bool xml_exists(text *xpath_expr_text, xmltype *data); I've also tried bool xml_exists(PG_FUNCTION_ARGS) { and finally in pg_proc.h I have: DATA(insert OID = 3037 ( xmlexists PGNSP PGUID 12 1 0 0 f f f t f i 3 0 16 25 142 _null_ _null_ _null_ _null_ xml_exists _null_ _null_ _null_ )); DESCR(evaluate XPath expression in a boolean context); (also tried ( xml_exists PGNSP)) After each attempt, I've blown away the installation, made clean and installed, initialised a fresh database and restored my sample database. I've had a grep around using position and it's target function textpos as examples but I fail to see any other file that they live in other than their implementation. As far as I can tell, I'm not doing anything different from position. Any thoughts? Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adding XMLEXISTS to the grammar
Look at how the POSITION() pseudofunction is defined around gram.y line 9651. Essentially any special syntax of this type gets converted to a regular function call internally. So in your case I think there will be some function that gets called something ike this: xmlexists(xpath_expression, xml_expression) ...but the grammar can be modified to allow a different syntax for that function call. I've finally managed to get gram.y to parse the syntax correctly. After progressing from a segmentation fault that occured when the grammar was correct I'm now left with a cryptic error: xmltest=# SELECT COUNT(id) FROM xmltest WHERE xmlexists('/menu/beers' PASSING BY REF data); ERROR: unrecognized node type: 1852140847 At a guess there is another step that I need to do after modifying gram.y. One mailing list posting I found mentioned copyfuncs.c but really I'm unsure as to what next. Anyone know what the missing step is? Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Adding XMLEXISTS to the grammar
Hi, I've been working to improve the syntax of the XMLEXISTS function that I put a patch forward for and have been attempting to get my head around how you modify the grammar. I admit I'm not getting much anywhere probably as I don't know bison but I'm starting to wonder if it's worth the pain given recent comments on this list about not changing the grammar for JSON support. At this point I can see a way of implementing the following abridged syntax (abridged as I'm not doing full XQuery support at this stage) in a conventional plain function call by handling the PG_FUNCTION_ARGS approriately, but would this acceptable? XMLEXISTS ( xpath_expression [ PASSING BY REF xml_expression [BY REF] ] ) In case it isn't, and indeed to help me with the XML schema validation work I'm doing, I would still like some help on how the grammar works. From what I've greped and seen in the comments you need to modify the following files: - src/backend/parser/gram.y - src/backend/parser/parse_expr.c - src/backend/utils/ruleutils.c - src/include/parser/kwlist.h From what I can tell, you add the keywords to the lists in gram.y and kwlist.h. At the appropriate place in gram.y you define the syntax and pull out what you need and stuff it into a node (in my case using the makeXmlExpr). You then modify parse_expr.c and ruleutils.c to handle the new values in the fields of the XmlExpr node. Assuming I'm right so far, the step I'm failing to figure out is where the actual c function that implements the function gets called/associated within the grammar. What am I missing? Thanks in advance, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
Peter Eisentraut wrote: On ons, 2010-05-26 at 11:47 +0100, Mike Fowler wrote: The XMLEXISTS function works with XQuery expressions and doesn't have the call signature that your patch implements Looking at the manuals of Oracle, Derby and DB2 I see how the call signature differs. I also note that Oracle's implementation is XPath only, Derby's is partial XQuery and DB2 appears to be full XQuery. What do people prefer me to do? I see the options as: 1) Change the call signature to match the standard 2) Change the function name back to xpath_exists It would be nice to make XMLEXISTS work as in the standard, seeing how many others are providing the same interface. Should option one be the more popular there's further choices: 1) Integrate XQuery support to completely match the standard, however this will require the addition of a new library libxquery 2) Leave the XPath as is, inline with Oracle's implementation 3) Hybrid approach. Since XML is a comple time option, add XQuery as another. Conditional completion gives the full XQuery support when available or just the XPath when not I think providing XPath is enough, at least for now Agreed. I'll get another patch together in the next day or two. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Tue, May 25, 2010 at 1:09 PM, Mike Fowler m...@mlfowler.com wrote: We're unlikely to accept this patch if it changes the minimum version of libxml2 required to compile PostgreSQL Why? 2.6.27 is almost 4 years old. At a minimum, I think it's fair to say that the burden is on you to justify what it's worth bumping the version number. Yes. Increasing the minimum required version of some library is a Big Deal, we don't do it on a whim. And we definitely don't do it just because it's old. regards, tom lane OK, I consider myself suitably educated/chastised. I now understand why a version bump is such a big deal. Your objections are all reasonable, I suppose I'm just used to living on the bleeding edge of everything. Consequently I have changed the code to produce the same result in a different way without using the new function. I've down-graded my version to 2.6.26 and it all compiles cleanly. Please find attached my revised patch, and thanks all for your advise. Regards, -- Mike Fowler Registered Linux user: 379787 Index: src/backend/utils/adt/xml.c === RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/src/backend/utils/adt/xml.c,v retrieving revision 1.97 diff -c -r1.97 xml.c *** src/backend/utils/adt/xml.c 3 Mar 2010 17:29:45 - 1.97 --- src/backend/utils/adt/xml.c 26 May 2010 09:36:50 - *** *** 3495,3497 --- 3495,3678 return 0; #endif } + + /* + * Determines if the node specified by the supplied XPath exists + * in a given XML document, returning a boolean. + * + * It is up to the user to ensure that the XML passed is in fact + * an XML document - XPath doesn't work easily on fragments without + * a context node being known. + */ + Datum + xmlexists(PG_FUNCTION_ARGS) + { + #ifdef USE_LIBXML + text *xpath_expr_text = PG_GETARG_TEXT_P(0); + xmltype*data = PG_GETARG_XML_P(1); + ArrayType *namespaces = PG_GETARG_ARRAYTYPE_P(2); + xmlParserCtxtPtr ctxt = NULL; + xmlDocPtr doc = NULL; + xmlXPathContextPtr xpathctx = NULL; + xmlXPathCompExprPtr xpathcomp = NULL; + xmlXPathObjectPtr xpathobj = NULL; + char *datastr; + int32 len; + int32 xpath_len; + xmlChar*string; + xmlChar*xpath_expr; + int i; + int ndim; + Datum *ns_names_uris; + bool *ns_names_uris_nulls; + int ns_count; + int result; + + /* + * Namespace mappings are passed as text[]. If an empty array is passed + * (ndim = 0, 0-dimensional), then there are no namespace mappings. + * Else, a 2-dimensional array with length of the second axis being equal + * to 2 should be passed, i.e., every subarray contains 2 elements, the + * first element defining the name, the second one the URI. Example: + * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2', + * 'http://example2.com']]. + */ + ndim = ARR_NDIM(namespaces); + if (ndim != 0) + { + int *dims; + + dims = ARR_DIMS(namespaces); + + if (ndim != 2 || dims[1] != 2) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(invalid array for XML namespace mapping), + errdetail(The array must be two-dimensional with length of the second axis equal to 2.))); + + Assert(ARR_ELEMTYPE(namespaces) == TEXTOID); + + deconstruct_array(namespaces, TEXTOID, -1, false, 'i', + ns_names_uris, ns_names_uris_nulls, + ns_count); + + Assert((ns_count % 2) == 0); /* checked above */ + ns_count /= 2; /* count pairs only */ + } + else + { + ns_names_uris = NULL; + ns_names_uris_nulls = NULL; + ns_count = 0; + } + + datastr = VARDATA(data); + len = VARSIZE(data) - VARHDRSZ; + xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ; + if (xpath_len == 0) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(empty XPath expression))); + + string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar)); + memcpy(string, datastr, len); + string[len] = '\0'; + + xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar)); + memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len); + xpath_expr[xpath_len] = '\0'; + + pg_xml_init(); + xmlInitParser(); + + PG_TRY(); + { + /* + * redundant XML parsing (two parsings for the same value during one + * command execution are possible) + */ + ctxt = xmlNewParserCtxt(); + if (ctxt == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate parser context); + doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0); + if (doc == NULL) + xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT, + could not parse XML document); + xpathctx = xmlXPathNewContext(doc); + if (xpathctx == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate XPath context); + xpathctx-node = xmlDocGetRootElement(doc
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
Peter Eisentraut wrote: On tis, 2010-05-25 at 15:31 +0100, Mike Fowler wrote: I've been reading the SQL/XML standard and discovered that it defines a function named XMLEXISTS that does exactly what the todo item xpath_exists defines. My original patch named the function as per the todo but I think using the function name from the standard is a better idea. So this patch is the same as before, but the function is now named XMLEXISTS instead of xpath_exists. The XMLEXISTS function works with XQuery expressions and doesn't have the call signature that your patch implements Looking at the manuals of Oracle, Derby and DB2 I see how the call signature differs. I also note that Oracle's implementation is XPath only, Derby's is partial XQuery and DB2 appears to be full XQuery. What do people prefer me to do? I see the options as: 1) Change the call signature to match the standard 2) Change the function name back to xpath_exists Should option one be the more popular there's further choices: 1) Integrate XQuery support to completely match the standard, however this will require the addition of a new library libxquery 2) Leave the XPath as is, inline with Oracle's implementation 3) Hybrid approach. Since XML is a comple time option, add XQuery as another. Conditional completion gives the full XQuery support when available or just the XPath when not Thoughts? -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard (was: Add xpath_exists Function)
I've been reading the SQL/XML standard and discovered that it defines a function named XMLEXISTS that does exactly what the todo item xpath_exists defines. My original patch named the function as per the todo but I think using the function name from the standard is a better idea. So this patch is the same as before, but the function is now named XMLEXISTS instead of xpath_exists. Regards, -- Mike Fowler Registered Linux user: 379787 Index: src/backend/utils/adt/xml.c === RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/src/backend/utils/adt/xml.c,v retrieving revision 1.97 diff -c -r1.97 xml.c *** src/backend/utils/adt/xml.c 3 Mar 2010 17:29:45 - 1.97 --- src/backend/utils/adt/xml.c 25 May 2010 14:02:33 - *** *** 3495,3497 --- 3495,3668 return 0; #endif } + + /* + * Determines if the node specified by the supplied XPath exists + * in a given XML document, returning a boolean. + * + * It is up to the user to ensure that the XML passed is in fact + * an XML document - XPath doesn't work easily on fragments without + * a context node being known. + */ + Datum + xmlexists(PG_FUNCTION_ARGS) + { + #ifdef USE_LIBXML + text *xpath_expr_text = PG_GETARG_TEXT_P(0); + xmltype*data = PG_GETARG_XML_P(1); + ArrayType *namespaces = PG_GETARG_ARRAYTYPE_P(2); + xmlParserCtxtPtr ctxt = NULL; + xmlDocPtr doc = NULL; + xmlXPathContextPtr xpathctx = NULL; + xmlXPathCompExprPtr xpathcomp = NULL; + char *datastr; + int32 len; + int32 xpath_len; + xmlChar*string; + xmlChar*xpath_expr; + int i; + int ndim; + Datum *ns_names_uris; + bool *ns_names_uris_nulls; + int ns_count; + int result; + + /* + * Namespace mappings are passed as text[]. If an empty array is passed + * (ndim = 0, 0-dimensional), then there are no namespace mappings. + * Else, a 2-dimensional array with length of the second axis being equal + * to 2 should be passed, i.e., every subarray contains 2 elements, the + * first element defining the name, the second one the URI. Example: + * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2', + * 'http://example2.com']]. + */ + ndim = ARR_NDIM(namespaces); + if (ndim != 0) + { + int *dims; + + dims = ARR_DIMS(namespaces); + + if (ndim != 2 || dims[1] != 2) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(invalid array for XML namespace mapping), + errdetail(The array must be two-dimensional with length of the second axis equal to 2.))); + + Assert(ARR_ELEMTYPE(namespaces) == TEXTOID); + + deconstruct_array(namespaces, TEXTOID, -1, false, 'i', + ns_names_uris, ns_names_uris_nulls, + ns_count); + + Assert((ns_count % 2) == 0); /* checked above */ + ns_count /= 2; /* count pairs only */ + } + else + { + ns_names_uris = NULL; + ns_names_uris_nulls = NULL; + ns_count = 0; + } + + datastr = VARDATA(data); + len = VARSIZE(data) - VARHDRSZ; + xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ; + if (xpath_len == 0) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(empty XPath expression))); + + string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar)); + memcpy(string, datastr, len); + string[len] = '\0'; + + xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar)); + memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len); + xpath_expr[xpath_len] = '\0'; + + pg_xml_init(); + xmlInitParser(); + + PG_TRY(); + { + /* + * redundant XML parsing (two parsings for the same value during one + * command execution are possible) + */ + ctxt = xmlNewParserCtxt(); + if (ctxt == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate parser context); + doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0); + if (doc == NULL) + xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT, + could not parse XML document); + xpathctx = xmlXPathNewContext(doc); + if (xpathctx == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate XPath context); + xpathctx-node = xmlDocGetRootElement(doc); + if (xpathctx-node == NULL) + xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR, + could not find root XML element); + + /* register namespaces, if any */ + if (ns_count 0) + { + for (i = 0; i ns_count; i++) + { + char *ns_name; + char *ns_uri; + + if (ns_names_uris_nulls[i * 2] || + ns_names_uris_nulls[i * 2 + 1]) + ereport(ERROR, + (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), + errmsg(neither namespace name nor URI may be null))); + ns_name = TextDatumGetCString(ns_names_uris[i * 2]); + ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]); + if (xmlXPathRegisterNs(xpathctx, + (xmlChar *) ns_name, + (xmlChar *) ns_uri) != 0) + ereport(ERROR
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
Erik Rijkers wrote: libxml2.x86_64 2.6.26-2.1.2.8 installed libxml2-devel.x86_642.6.26-2.1.2.8 installed Thanks for testing my patch Erik. It turns out I've got libxml2 installed at version 2.7.5. Searching the gnome mailing lists, it turns out xmlXPathCompiledEvalToBoolean was added (unbelievably) in the very next version from yours, 2.6.27 (see: http://mail.gnome.org/archives/xml/2006-October/msg00119.html). Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
Robert Haas wrote: On Tue, May 25, 2010 at 12:04 PM, Mike Fowler m...@mlfowler.com wrote: Erik Rijkers wrote: libxml2.x86_64 2.6.26-2.1.2.8 installed libxml2-devel.x86_642.6.26-2.1.2.8 installed Thanks for testing my patch Erik. It turns out I've got libxml2 installed at version 2.7.5. Searching the gnome mailing lists, it turns out xmlXPathCompiledEvalToBoolean was added (unbelievably) in the very next version from yours, 2.6.27 (see: http://mail.gnome.org/archives/xml/2006-October/msg00119.html). We're unlikely to accept this patch if it changes the minimum version of libxml2 required to compile PostgreSQL Why? 2.6.27 is almost 4 years old. I realise that my patch didn't update configure and configure.in, and indeed I didn't think of it when I responded to Erik (I'm too used to the Java world where people manage their own dependencies). I've now attached the updated patch which ups the check from version 2.6.23 to 2.6.27. Regards, -- Mike Fowler Registered Linux user: 379787 Index: configure === RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/configure,v retrieving revision 1.679 diff -c -r1.679 configure *** configure 13 May 2010 22:07:40 - 1.679 --- configure 25 May 2010 16:57:49 - *** *** 9079,9087 if test $with_libxml = yes ; then ! { $as_echo $as_me:$LINENO: checking for xmlSaveToBuffer in -lxml2 5 ! $as_echo_n checking for xmlSaveToBuffer in -lxml2... 6; } ! if test ${ac_cv_lib_xml2_xmlSaveToBuffer+set} = set; then $as_echo_n (cached) 6 else ac_check_lib_save_LIBS=$LIBS --- 9079,9087 if test $with_libxml = yes ; then ! { $as_echo $as_me:$LINENO: checking for xmlXPathCompiledEvalToBoolean in -lxml2 5 ! $as_echo_n checking for xmlXPathCompiledEvalToBoolean in -lxml2... 6; } ! if test ${ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean+set} = set; then $as_echo_n (cached) 6 else ac_check_lib_save_LIBS=$LIBS *** *** 9099,9109 #ifdef __cplusplus extern C #endif ! char xmlSaveToBuffer (); int main () { ! return xmlSaveToBuffer (); ; return 0; } --- 9099,9109 #ifdef __cplusplus extern C #endif ! char xmlXPathCompiledEvalToBoolean (); int main () { ! return xmlXPathCompiledEvalToBoolean (); ; return 0; } *** *** 9129,9140 test $cross_compiling = yes || $as_test_x conftest$ac_exeext }; then ! ac_cv_lib_xml2_xmlSaveToBuffer=yes else $as_echo $as_me: failed program was: 5 sed 's/^/| /' conftest.$ac_ext 5 ! ac_cv_lib_xml2_xmlSaveToBuffer=no fi rm -rf conftest.dSYM --- 9129,9140 test $cross_compiling = yes || $as_test_x conftest$ac_exeext }; then ! ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean=yes else $as_echo $as_me: failed program was: 5 sed 's/^/| /' conftest.$ac_ext 5 ! ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean=no fi rm -rf conftest.dSYM *** *** 9142,9150 conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi ! { $as_echo $as_me:$LINENO: result: $ac_cv_lib_xml2_xmlSaveToBuffer 5 ! $as_echo $ac_cv_lib_xml2_xmlSaveToBuffer 6; } ! if test x$ac_cv_lib_xml2_xmlSaveToBuffer = xyes; then cat confdefs.h _ACEOF #define HAVE_LIBXML2 1 _ACEOF --- 9142,9150 conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi ! { $as_echo $as_me:$LINENO: result: $ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean 5 ! $as_echo $ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean 6; } ! if test x$ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean = xyes; then cat confdefs.h _ACEOF #define HAVE_LIBXML2 1 _ACEOF *** *** 9152,9159 LIBS=-lxml2 $LIBS else ! { { $as_echo $as_me:$LINENO: error: library 'xml2' (version = 2.6.23) is required for XML support 5 ! $as_echo $as_me: error: library 'xml2' (version = 2.6.23) is required for XML support 2;} { (exit 1); exit 1; }; } fi --- 9152,9159 LIBS=-lxml2 $LIBS else ! { { $as_echo $as_me:$LINENO: error: library 'xml2' (version = 2.6.27) is required for XML support 5 ! $as_echo $as_me: error: library 'xml2' (version = 2.6.27) is required for XML support 2;} { (exit 1); exit 1; }; } fi Index: configure.in === RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/configure.in,v retrieving revision 1.627 diff -c -r1.627 configure.in *** configure.in 13 May 2010 22:07:42 - 1.627 --- configure.in 25 May 2010 16:22:32 - *** *** 940,946 fi if test $with_libxml = yes ; then ! AC_CHECK_LIB(xml2, xmlSaveToBuffer, [], [AC_MSG_ERROR([library 'xml2' (version = 2.6.23) is required for XML support])]) fi if test $with_libxslt = yes ; then --- 940,946 fi if test $with_libxml = yes ; then ! AC_CHECK_LIB(xml2
Re: [HACKERS] ecmascript 5 DATESTYLE
Pavel Stehule wrote: 2010/5/19 Peter Eisentraut pete...@gmx.net: On tis, 2010-05-18 at 18:26 -0400, Ben Hockey wrote: ecmascript 5 is the most recent specification for JavaScript and i would think that having a DATESTYLE format to simplify interoperability with JavaScript applications would be highly desirable. Note that we haven't got any other datestyles that are intended to support interoperability with some language. It is usually the job of the client driver to convert PostgreSQL data (plural of datum) to the appropriate type and format for the client environment or language. Is there any reason why JavaScript would be different? I wouldn't be keen to see dedicated language specific handling of date/datetime formats. It would lead to an explosion of functions with new languages needing adding as and when their users jumped up and down on us. However a generic format could be very useful and would give the opportunity for people who need a language specific short cut the opportunity to do a CREATE FUNCTION wrapping the generic one with a hard coded format specifier. Other platforms have generic support for this kind of task, for example SQLServer: http://msdn.microsoft.com/en-us/library/ms187928.aspx. I wouldn't recommend the SQLServer way, I think numeric format specifiers are clumsy. Perhaps a mechanism like Java which is nicely summarized here: http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html Pavel: Why do you believe a generic format function would lead to SQL injections attacks? JavaScript isn't special language, but JSON is wide used format for interoperability. And same is true for XML datestyle format. Regards Pavel I think that the postgres handling of those data types should handle the date encoding themselves. For example, a XMLELEMENT call that was passed a date would format the date string to the xs:date format (e.g. 2010-05-19) and when passed a timestamp format to xs:datetime (e.g. 2010-05-19T09:29:52+01:00). I would see the JSON handling as being no different. Thanks, -- Mike Fowler Registered Linux user: 379787 I could be a genius if I just put my mind to it, and I, I could do anything, if only I could get 'round to it -PULP 'Glory Days' -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
Pavel Stehule wrote: see google: lateral sql injection oracle NLS_DATE_FORMAT I would to like this functionality too - and technically I don't see a problem - It's less than 100 lines, but I don't need a new security problem. So my proposal is change nothing on this integrated functionality and add new custom date type - like cdate that can be customized via GUC. Regards Pavel OK I found www.databasesecurity.com/dbsec/lateral-sql-injection.pdf. From the way I read this, the exploit relies on adjusting the NLS_DATE_FORMAT to an arbitrary string which is then used for the attack, To me this is easy to code against, simply lock the date format right down and ensure that it is always controlled. IMHO I don't see an Oracle specific attack as a reason why we can't have a generic format. Surely we can learn from this known vulnerability and get another one up on Oracle? Thanks, -- Mike Fowler Registered Linux user: 379787 I could be a genius if I just put my mind to it, and I, I could do anything, if only I could get 'round to it -PULP 'Glory Days' -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecmascript 5 DATESTYLE
Pavel Stehule wrote: 2010/5/19 Mike Fowler m...@mlfowler.com: Pavel Stehule wrote: see google: lateral sql injection oracle NLS_DATE_FORMAT I would to like this functionality too - and technically I don't see a problem - It's less than 100 lines, but I don't need a new security problem. So my proposal is change nothing on this integrated functionality and add new custom date type - like cdate that can be customized via GUC. Regards Pavel OK I found www.databasesecurity.com/dbsec/lateral-sql-injection.pdf. From the way I read this, the exploit relies on adjusting the NLS_DATE_FORMAT to an arbitrary string which is then used for the attack, To me this is easy to code against, simply lock the date format right down and ensure that it is always controlled. IMHO I don't see an Oracle specific attack as a reason why we can't have a generic format. Surely we can learn from this known vulnerability and get another one up on Oracle? I am not a security expert - you can simply don't allow apostrophe, double quotes - but I am not sure, if this can be safe - simply - I am abe to write this patch, but I am not able to ensure security. Regards Pavel Well you've rightly identified a potential security hole, so my recommendation would be to put the patch together bearing in mind the Oracle vulnerability. Once you've submitted the patch it can be reviewed and we can ensure that you've managed to steer clear of introducing the same/similar vulnerability into postgres. Am I right in thinking that you're now proposing to do the generic patch that Robert Haas and I prefer? Thanks, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Adding XML Schema validation (XMLVALIDATE)
Hi, I'm going to start work on another XML todo item: Add XML Schema validation and xmlvalidate function (SQL:2008) The standard identifies XMLVALIDATE as: XML validate ::= XMLVALIDATE left paren document or content or sequence XML value expression [ XML valid according to clause] right paren so I've got something quite clear to work too. libxml has the required support for schema validation so I'll just be wrapping it's functionality much like I did for xpath_exists(). Anyone got any thoughts before I get busy? Thanks, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adding xpath_exists function
Robert Haas wrote: Please email your patch to the list (replying to this email is fine) and add it here: https://commitfest.postgresql.org/action/commitfest_view/open Here's my patch, developed against HEAD, that adds the function 'xpath_exists'. The function is a lot simpler than originally thought, so none of the string manipulation previously discussed was required. I've also included some regression tests that test the function with and without xml namespaces. I should note that before I added my tests all existing tests passed. One observation that can be made is that I've largely copied the existing xpath function and altered it to use a different method from the libxml API. I've done it to save me redoing all the namespace handling, however it's apparent to me that if we wanted to expose more of the libxml api we will quickly start having a lot of duplicate code. I notice that refactoring existing code whilst adding new code is generally frowned upon, so once this patch is accepted I will look to refactor the xpath and xpath_exists function. I could even add an xpath_count method at the same time ;) . Thanks in advance for any and all feedback, -- Mike Fowler Registered Linux user: 379787 I could be a genius if I just put my mind to it, and I, I could do anything, if only I could get 'round to it -PULP 'Glory Days' Index: src/backend/utils/adt/xml.c === RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/src/backend/utils/adt/xml.c,v retrieving revision 1.97 diff -c -r1.97 xml.c *** src/backend/utils/adt/xml.c 3 Mar 2010 17:29:45 - 1.97 --- src/backend/utils/adt/xml.c 11 May 2010 07:54:53 - *** *** 3495,3497 --- 3495,3670 return 0; #endif } + + /* + * Determines if the node specified by the supplied XPath exists + * in a given XML document, returning a boolean. + * + * It is up to the user to ensure that the XML passed is in fact + * an XML document - XPath doesn't work easily on fragments without + * a context node being known. + */ + Datum + xpath_exists(PG_FUNCTION_ARGS) + { + #ifdef USE_LIBXML + text *xpath_expr_text = PG_GETARG_TEXT_P(0); + xmltype*data = PG_GETARG_XML_P(1); + ArrayType *namespaces = PG_GETARG_ARRAYTYPE_P(2); + ArrayBuildState *astate = NULL; + xmlParserCtxtPtr ctxt = NULL; + xmlDocPtr doc = NULL; + xmlXPathContextPtr xpathctx = NULL; + xmlXPathCompExprPtr xpathcomp = NULL; + char *datastr; + int32 len; + int32 xpath_len; + xmlChar*string; + xmlChar*xpath_expr; + int i; + int res_nitems; + int ndim; + Datum *ns_names_uris; + bool *ns_names_uris_nulls; + int ns_count; + int result; + + /* + * Namespace mappings are passed as text[]. If an empty array is passed + * (ndim = 0, 0-dimensional), then there are no namespace mappings. + * Else, a 2-dimensional array with length of the second axis being equal + * to 2 should be passed, i.e., every subarray contains 2 elements, the + * first element defining the name, the second one the URI. Example: + * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2', + * 'http://example2.com']]. + */ + ndim = ARR_NDIM(namespaces); + if (ndim != 0) + { + int *dims; + + dims = ARR_DIMS(namespaces); + + if (ndim != 2 || dims[1] != 2) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(invalid array for XML namespace mapping), + errdetail(The array must be two-dimensional with length of the second axis equal to 2.))); + + Assert(ARR_ELEMTYPE(namespaces) == TEXTOID); + + deconstruct_array(namespaces, TEXTOID, -1, false, 'i', + ns_names_uris, ns_names_uris_nulls, + ns_count); + + Assert((ns_count % 2) == 0); /* checked above */ + ns_count /= 2; /* count pairs only */ + } + else + { + ns_names_uris = NULL; + ns_names_uris_nulls = NULL; + ns_count = 0; + } + + datastr = VARDATA(data); + len = VARSIZE(data) - VARHDRSZ; + xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ; + if (xpath_len == 0) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg(empty XPath expression))); + + string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar)); + memcpy(string, datastr, len); + string[len] = '\0'; + + xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar)); + memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len); + xpath_expr[xpath_len] = '\0'; + + pg_xml_init(); + xmlInitParser(); + + PG_TRY(); + { + /* + * redundant XML parsing (two parsings for the same value during one + * command execution are possible) + */ + ctxt = xmlNewParserCtxt(); + if (ctxt == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate parser context); + doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0); + if (doc == NULL) + xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT, + could
Re: [HACKERS] Adding xpath_exists function
Robert Haas wrote: Oh, I see. Well, that might be reasonable syntactic sugar, although I think you should make it wrap the path in exists() unconditionally, rather than testing for an existing wrap. I'll leave it out for now, it saves me some effort after all. Please email your patch to the list (replying to this email is fine) and add it here: https://commitfest.postgresql.org/action/commitfest_view/open Will do once I've finished. Thanks for your feedback. Regards, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Adding xpath_exists function
Hi hackers, Although this is a very small change I figured I'd practice the policy of outlining your change before you write the code and attempt a patch submission. Essentially I see the function as a convenience function that exposes the results of the xpath built in exists() function as a boolean for easier SQL. The syntax will match the xpath function already present: xpath_exists(xpath, xml[, nsarray]) The implementation will check that the xpath value contains 'exists(.)' and add it if not present for added usability. I can't blindly wrap the value with exists(...) as exists(exists(.)) will always return true in xpath. I've not allocated the oid yet, but will try to earn my bonus points for getting it close to xpath()'s oid. :) Appropriate regression tests will be added after I've ensured that 'make check' is happy I've not regressed anything. Can I confirm that contrib/xml2 is deprecated and I should be carrying out my work in backend/utils/adt/xml.c? I shall be coding it up over the next day or two, work family permitting! Thanks, -- Mike Fowler Registered Linux user: 379787 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] XML Todo List
Hackers, I'm interested in tackling some of the todo items in XML category. Being new to postgres hacking I'm hoping I chose an item that isn't more than I can chew in the first sitting. One item that has caught my eye that I (naively) hope isn't a huge todo is: xpath_exists() is needed. It checks, whether or not the path specified exists in the XML value. (W/o this function we need to use weird array_dims(xpath(...)) IS NOT NULL syntax.) Is any one else working on the XML todos who might have some friendly pointers to help me on my way or am I just better off getting some code together for general review? Thanks, -- Mike Fowler Registered Linux user: 379787 I could be a genius if I just put my mind to it, and I, I could do anything, if only I could get 'round to it -PULP 'Glory Days' -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] XML Todo List
Peter Eisentraut wrote: On ons, 2010-04-28 at 15:21 +0100, Mike Fowler wrote: xpath_exists() is needed. It checks, whether or not the path specified exists in the XML value. (W/o this function we need to use weird array_dims(xpath(...)) IS NOT NULL syntax.) That sounds like a reasonable project. Is any one else working on the XML todos who might have some friendly pointers to help me on my way or am I just better off getting some code together for general review? I think you're it. :) Great, I won't be stepping on any toes! -- Mike Fowler Registered Linux user: 379787 I could be a genius if I just put my mind to it, and I, I could do anything, if only I could get 'round to it -PULP 'Glory Days' -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] XML Todo List
Hackers, I'm interested in tackling some of the todo items in XML category. Being new to postgres hacking I'm hoping I chose an item that isn't more than I can chew in the first sitting. One item that has caught my eye that I (naively) hope isn't a huge todo is: xpath_exists() is needed. It checks, whether or not the path specified exists in the XML value. (W/o this function we need to use weird array_dims(xpath(...)) IS NOT NULL syntax.) Is any one else working on the XML todos who might have some friendly pointers to help me on my way or am I just better off getting some code together for general review? Thanks, -- Mike Fowler Registered Linux user: 379787 I could be a genius if I just put my mind to it, and I, I could do anything, if only I could get 'round to it -PULP 'Glory Days' -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers