Re: [JDBC] [HACKERS] JDBC connections to 9.1

2011-04-18 Thread Mike Fowler

On 18/04/11 15:57, Tom Lane wrote:

Bernd Helmlemaili...@oopsware.de  writes:

If i am reading it correct, it reads UTF8 from the backend, while
expecting UNICODE only. Not sure what change has caused this,
though.

I am --- when I redid the GUC assign_hook logic a few weeks ago,
I changed the client_encoding code so that it shows the normalized
(official) name of the encoding, not whatever random string the client
sent over.  For instance, previous versions:

regression=# set client_encoding = 'UnIcOdE';
SET
regression=# show client_encoding ;
  client_encoding
-
  UnIcOdE
(1 row)

versus HEAD:

regression=# set client_encoding = 'UnIcOdE';
SET
regression=# show client_encoding ;
  client_encoding
-
  UTF8
(1 row)

I wasn't aware that JDBC would fail on that.  It's pretty annoying that
it does, but maybe we should grin and bear it, ie revert the change to
canonicalize the GUC's value?

regards, tom lane

Am I right in thinking that would be that change committed on the 7th 
(http://archives.postgresql.org/pgsql-committers/2011-04/msg00039.php) ? 
I've just run the JDBC test build on my machine and it fails dismally 
with this very message repeated over and over again. What concerns me 
most is that (assuming my dates are right) the JDBC driver has been 
broken for 11 days and no one noticed. This would lead me to believe 
that there is no JDBC build server. What would it take to set one up? If 
someone can point me to a test machine I'd happily assist in setting one up.


As for the breakage itself I'm OK with a new driver version for a new 
database version and from my experience people expect that. I recall a 
number of people asking me if an 8.4 driver would be OK to use against 9 
before the 9 version was stable.


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [JDBC] [HACKERS] JDBC connections to 9.1

2011-04-18 Thread Mike Fowler

On 18/04/11 17:35, Andrew Dunstan wrote:



On 04/18/2011 11:25 AM, Tom Lane wrote:


What concerns me  most is that (assuming my dates are right) the 
JDBC driver has been

broken for 11 days and no one noticed. This would lead me to believe
that there is no JDBC build server. What would it take to set one up?

+1 for doing something along that line.




All you'd need to do is write a step for a buildfarm animal to fetch 
the JDBC driver and run some tests, and run it in a buildfarm client 
somewhere. The server code is quite agnostic about the steps that are 
reported on.


IOW in addition to a running buildfarm member you need to write a 
small amount ( 100 lines, possibly much less) of perl code.


cheers

andrew


I've found the entry on the Developer Wiki 
(http://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto). What I'll 
do is set-up three farms on my machine - one for 1.4, one for 1.5 and 
one for 1.6. It's been a while since I've had an excuse to write some 
Perl! I can't guarantee when I'll have it done as I'm away for a little 
over a week from Wednesday and I'm not allowed internet access!


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [JDBC] [HACKERS] JDBC connections to 9.1

2011-04-18 Thread Mike Fowler

On 18/04/11 17:12, Tom Lane wrote:

Dave Cramerp...@fastcrypt.com  writes:

Well initially my concern was that people would have a challenge in
the case where they had to re-certify their application if we made
this change, however I realize they will have to do this anyway since
upgrading to 9.1 is what necessitates it.

I don't see any backwards compatibility risk, if that's what you mean.
Every backend release since 7.3 has treated client_encoding 'UTF8' and
'UNICODE' the same, and earlier releases didn't accept either one.

regards, tom lane



As there seems to be a consensus forming for fixing the JDBC driver, 
I've taken the liberty do so at the risk of being shot down. The patch 
is fairly straightforward, just changing UNICODE to UTF8 in a number of 
files including the translation files. I've tested this against 9.1devel 
(HEAD) and 8.4.7. For each database version I build and the tests 
running JDKs 1.4.2_19, 1.5.0_22 and 1.6.0_2. All on 32-bit.


Regards,

--
Mike Fowler
Registered Linux user: 379787

Index: doc/pgjdbc.xml
===
RCS file: /cvsroot/jdbc/pgjdbc/doc/pgjdbc.xml,v
retrieving revision 1.40
diff -c -r1.40 pgjdbc.xml
*** doc/pgjdbc.xml	25 Dec 2010 07:07:44 -	1.40
--- doc/pgjdbc.xml	18 Apr 2011 16:32:49 -
***
*** 179,185 
  encoding and you will have problems the moment you store data in it that
  does not fit in the seven bit acronymASCII/acronym character set.
  If you do not know what your encoding will be or are otherwise unsure
! about what you will be storing the literalUNICODE/literal encoding
  is a reasonable default to use.
 /para
/sect1
--- 179,185 
  encoding and you will have problems the moment you store data in it that
  does not fit in the seven bit acronymASCII/acronym character set.
  If you do not know what your encoding will be or are otherwise unsure
! about what you will be storing the literalUTF8/literal encoding
  is a reasonable default to use.
 /para
/sect1
Index: org/postgresql/core/BaseConnection.java
===
RCS file: /cvsroot/jdbc/pgjdbc/org/postgresql/core/BaseConnection.java,v
retrieving revision 1.23
diff -c -r1.23 BaseConnection.java
*** org/postgresql/core/BaseConnection.java	1 May 2010 14:40:51 -	1.23
--- org/postgresql/core/BaseConnection.java	18 Apr 2011 16:32:49 -
***
*** 96,102 
  
  /**
   * Encode a string using the database's client_encoding
!  * (usually UNICODE, but can vary on older server versions).
   * This is used when constructing synthetic resultsets (for
   * example, in metadata methods).
   *
--- 96,102 
  
  /**
   * Encode a string using the database's client_encoding
!  * (usually UTF8, but can vary on older server versions).
   * This is used when constructing synthetic resultsets (for
   * example, in metadata methods).
   *
Index: org/postgresql/core/v2/ConnectionFactoryImpl.java
===
RCS file: /cvsroot/jdbc/pgjdbc/org/postgresql/core/v2/ConnectionFactoryImpl.java,v
retrieving revision 1.21
diff -c -r1.21 ConnectionFactoryImpl.java
*** org/postgresql/core/v2/ConnectionFactoryImpl.java	31 Mar 2011 03:06:38 -	1.21
--- org/postgresql/core/v2/ConnectionFactoryImpl.java	18 Apr 2011 16:32:49 -
***
*** 380,388 
  // 7.3 server that defaults to autocommit = off.
  
  if (logger.logDebug())
! logger.debug(Switching to UNICODE client_encoding);
  
! String sql = begin; set autocommit = on; set client_encoding = 'UNICODE'; ;
  if (dbVersion.compareTo(9.0) = 0) {
  sql += SET extra_float_digits=3; ;
  } else if (dbVersion.compareTo(7.4) = 0) {
--- 380,388 
  // 7.3 server that defaults to autocommit = off.
  
  if (logger.logDebug())
! logger.debug(Switching to UTF8 client_encoding);
  
! String sql = begin; set autocommit = on; set client_encoding = 'UTF8'; ;
  if (dbVersion.compareTo(9.0) = 0) {
  sql += SET extra_float_digits=3; ;
  } else if (dbVersion.compareTo(7.4) = 0) {
***
*** 391,397 
  sql += commit;
  
  SetupQueryRunner.run(protoConnection, sql, false);
! protoConnection.setEncoding(Encoding.getDatabaseEncoding(UNICODE));
  }
  else
  {
--- 391,397 
  sql += commit;
  
  SetupQueryRunner.run(protoConnection, sql, false);
! protoConnection.setEncoding(Encoding.getDatabaseEncoding(UTF8));
  }
  else
  {
Index: org/postgresql/core/v3/ConnectionFactoryImpl.java

Re: [HACKERS] PG 9.0.3. How to select rows from xml

2011-04-04 Thread Mike Fowler

Hi,

On 02/04/11 20:47, emanov wrote:

Hi all!
What i need is transform xml document to table like that:
insert into tmp(Name, Value)
select t.Name, t.Value from myxml.Nodes(/doc/person) as
t('Name:varchar|Value:int')
or similar. In fact I have many rows with many columns.

How I can do it with PG 9.0.3 where I can't find xpath_table?



Though deprecated, the xml2 contrib module with xpath_table is still 
present. Installation instructions can be found on 
http://www.postgresql.org/docs/9.0/static/contrib.html and the module is 
documented at http://www.postgresql.org/docs/9.0/static/xml2.html.


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] new keywords in 9.1

2011-03-15 Thread Mike Fowler

On 12/03/11 05:18, Robert Haas wrote:

XMLEXISTS is pretty horrible in that the
syntax apparently requires three new keywords (XMLEXISTS, PASSING,
REF) which is pretty lame but I guess it's specified by the standard
so I'm not sure there's much we can do about it.  The rest look
reasonable and necessary AFAICT
I can confirm that I added the three keywords as described in the 
SQL/XML standard (section 8.4). Apologies for the delayed confirmation, 
I missed the thread when it was started and only noticed when your 
commit message arrived.


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Native XML

2011-02-27 Thread Mike Fowler

On 27/02/11 19:37, David E. Wheeler wrote:

On Feb 27, 2011, at 11:23 AM, Tom Lane wrote:


Well, that's why I asked --- if it's going to be a huge chunk of code,
then I agree this is the wrong path to pursue.  However, I do feel that
libxml pretty well sucks, so if we could replace it with a relatively
small amount of code, that might be the right thing to do.

I think that XML parsers must be hard to get really right, because of all those 
I've used in Perl, XML::LibXML is far and away the best. Its docs suck, but it 
does the work really well.

No, because the xpath stuff is fundamentally broken, and nobody seems to
know how to make libxslt do what we actually need.  See the open bugs
on the TODO list.

XPath is broken? I use it heavily in the Perl module Test::XPath and now, in 
PostgreSQL, with my explanation extension.

   http://github.com/theory/explanation/

Is this something I need to worry about
I don't believe that XPath is fundamentally broken, but I think Tom 
may have meant xslt. When reviewing a recent patch to xml2/xslt I found 
a few bugs in the way were using libxslt, as well as a bug in the 
library itself (see 
http://archives.postgresql.org/pgsql-hackers/2011-02/msg01878.php).


However if Tom does mean that xpath is the culprit, it may be with the 
way the libxml2 library works. It's a very messy singleton. If I'm 
wrong, I'm sure I'll be corrected!


Regards,
--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] First patch proposal

2010-11-03 Thread Mike Fowler

On 14/10/10 15:53, Alastair Turner wrote:

It isn't a TODO item, or related to any previous thread I could find.
   
It's certainly something I can see a use for. When I'm having a bad 
typing day I get annoyed that I find I've made a mistake after I've 
typed the password. To me this is a feature that will just make life a 
little more pleasant for command line junkies like me.


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] XML schema validation

2010-10-15 Thread Mike Fowler

On 15/10/10 15:08, Tomáš Pospíšil wrote:

Hi Robert,

I would like to contribute for community. Under licence used by PostgreSQL.

So is (or was) there anybody with the same interest as me?
   
Yes, I've been looking at improving the XML support overall, or at least 
working to finish the implementation of the SQL 2008 XML specification. 
Currently I've been looking at implementing XQuery, however I have 
recently started to come to the conclusion that Schema validation should 
be implemented first as there is some overlap of functionality.


What have you looked at? Do you have an outline design you could share 
on how you would go about adding schema validation?


Regrads,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] First patch proposal

2010-10-15 Thread Mike Fowler

On 14/10/10 15:53, Alastair Turner wrote:

It isn't a TODO item, or related to any previous thread I could find.
   


It's certainly something I can see a use for. When I'm having a bad 
typing day I get annoyed that I find I've made a mistake after I've 
typed the password. To me this is a feature that will just make life a 
little more pleasant for command line junkies like me.


Regards,

--
Mike Fowler
Registered Linux user: 379787

NB: Post attmpt two, yesterday's was never delievered to hackers so apologies 
if Alastair and Hitoshi have seen this message once already.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Mike Fowler

On 13/08/10 10:45, Joseph Adams wrote:

This patch doesn't include tests .  How would I go about writing them?

I have made the JSON data type built-in, and I will post that patch
shortly (it depends on this one).  The built-in JSON data type uses
all of these utility functions, and the tests for the JSON data type
pass.
   


Joseph,

Most existing data types have a regression SQL script in 
src/test/regress/sql. Using one of them as an example to draw some 
inspiration from, you should be able to write a 'json.sql' script that 
exercises the data type and it's supporting functions. Full instructions 
can be found at http://wiki.postgresql.org/wiki/Regression_test_authoring


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: xml_is_well_formed

2010-08-11 Thread Mike Fowler

On 11/08/10 21:27, Tom Lane wrote:

Robert Haasrobertmh...@gmail.com  writes:

On Mon, Aug 9, 2010 at 10:41 AM, Robert Haasrobertmh...@gmail.com  wrote:

There's also the fact that it would probably end up parsing the data
twice.  Given xmloption, I'm inclined to think Tom has it right:
provided xml_is_well_formed() that follows xmloption, plus a specific
version for each of content and document.



Another reasonable option here would be to forget about having
xml_is_well_formed() per se and ONLY offer
xml_is_well_formed_content() and xml_is_well_formed_document().


We already have xml_is_well_formed(); just dropping it doesn't seem like
a helpful choice.


As a project management note, this CommitFest is over in 4 days, so
unless we have a new version of this patch real soon now we need to
defer it to the September 15th CommitFest


Yes.  Mike, are you expecting to submit a new version before the end of
the week?



Yes and here it is, apologies for the delay. I have re-implemented 
xml_is_well_formed such that it is sensitive to the XMLOPTION. The 
additional _document and _content methods are now present. Tests and 
documentation adjusted to suit.


Regards,

--
Mike Fowler
Registered Linux user: 379787
*** a/contrib/xml2/pgxml.sql.in
--- b/contrib/xml2/pgxml.sql.in
***
*** 5,18  SET search_path = public;
  
  --SQL for XML parser
  
- CREATE OR REPLACE FUNCTION xml_is_well_formed(text) RETURNS bool
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT IMMUTABLE;
- 
  -- deprecated old name for xml_is_well_formed
  CREATE OR REPLACE FUNCTION xml_valid(text) RETURNS bool
! AS 'MODULE_PATHNAME', 'xml_is_well_formed'
! LANGUAGE C STRICT IMMUTABLE;
  
  CREATE OR REPLACE FUNCTION xml_encode_special_chars(text) RETURNS text
  AS 'MODULE_PATHNAME'
--- 5,14 
  
  --SQL for XML parser
  
  -- deprecated old name for xml_is_well_formed
  CREATE OR REPLACE FUNCTION xml_valid(text) RETURNS bool
! AS 'xml_is_well_formed'
! LANGUAGE INTERNAL STRICT IMMUTABLE;
  
  CREATE OR REPLACE FUNCTION xml_encode_special_chars(text) RETURNS text
  AS 'MODULE_PATHNAME'
*** a/contrib/xml2/uninstall_pgxml.sql
--- b/contrib/xml2/uninstall_pgxml.sql
***
*** 29,33  DROP FUNCTION xml_encode_special_chars(text);
  
  -- deprecated old name for xml_is_well_formed
  DROP FUNCTION xml_valid(text);
- 
- DROP FUNCTION xml_is_well_formed(text);
--- 29,31 
*** a/contrib/xml2/xpath.c
--- b/contrib/xml2/xpath.c
***
*** 27,33  PG_MODULE_MAGIC;
  
  /* externally accessible functions */
  
- Datum		xml_is_well_formed(PG_FUNCTION_ARGS);
  Datum		xml_encode_special_chars(PG_FUNCTION_ARGS);
  Datum		xpath_nodeset(PG_FUNCTION_ARGS);
  Datum		xpath_string(PG_FUNCTION_ARGS);
--- 27,32 
***
*** 71,97  pgxml_parser_init(void)
  }
  
  
- /* Returns true if document is well-formed */
- 
- PG_FUNCTION_INFO_V1(xml_is_well_formed);
- 
- Datum
- xml_is_well_formed(PG_FUNCTION_ARGS)
- {
- 	text	   *t = PG_GETARG_TEXT_P(0);		/* document buffer */
- 	int32		docsize = VARSIZE(t) - VARHDRSZ;
- 	xmlDocPtr	doctree;
- 
- 	pgxml_parser_init();
- 
- 	doctree = xmlParseMemory((char *) VARDATA(t), docsize);
- 	if (doctree == NULL)
- 		PG_RETURN_BOOL(false);	/* i.e. not well-formed */
- 	xmlFreeDoc(doctree);
- 	PG_RETURN_BOOL(true);
- }
- 
- 
  /* Encodes special characters (, , ,  and \r) as XML entities */
  
  PG_FUNCTION_INFO_V1(xml_encode_special_chars);
--- 70,75 
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8625,8630  SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF 'townstownTor
--- 8625,8736 
   supports XPath, which is a subset of XQuery.
  /para
 /sect3
+ 
+sect3
+ titlexml_is_well_formed/title
+ 
+ indexterm
+  primaryxml_is_well_formed/primary
+  secondarywell formed/secondary
+ /indexterm
+ 
+ synopsis
+ functionxml_is_well_formed/function(replaceabletext/replaceable)
+ /synopsis
+ 
+ para
+  The function functionxml_is_well_formed/function evaluates whether
+  the replaceabletext/replaceable is well formed XML content, returning
+  a boolean. It is useful for predetermining whether a cast to the XML type
+  will succeed, and therefore honours the current value of the
+  literalXMLOPTION/literal session configuration parameter.
+ /para
+ para
+ Example:
+ screen![CDATA[
+ SET xmloption TO DOCUMENT;
+ SELECT xml_is_well_formed('');
+  xml_is_well_formed 
+ 
+  f
+ (1 row)
+ 
+ SELECT xml_is_well_formed('abc/');
+  xml_is_well_formed 
+ 
+  t
+ (1 row)
+ 
+ SET xmloption TO CONTENT;
+ SELECT xml_is_well_formed('abc');
+  xml_is_well_formed 
+ 
+  t
+ (1 row)
+ ]]/screen
+ /para
+ para
+ In addition to the structure checks, the function ensures that namespaces are correcty matched.
+ screen![CDATA[
+ SET xmloption TO DOCUMENT;
+ SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org

Re: [HACKERS] Initial review of xslt with no limits patch

2010-08-09 Thread Mike Fowler

On 09/08/10 04:07, Tom Lane wrote:

Robert Haasrobertmh...@gmail.com  writes:

On Sun, Aug 8, 2010 at 11:36 AM, Mike Fowlerm...@mlfowler.com  wrote:

1) XML2 is largely undocumented, giving rise to the problems encountered.
Since the module is deprecated anyways, does it make more sense to get xslt
handling moved into core and get it fully documented?



Yes, I think that would be better.


I'm hesitant to consider pulling this into core when there's so little
consensus on how it ought to act.  It'd be better to have a solid,
widely used contrib module *first*, rather than imagine that pulling it
into core is somehow a cure for its problems.


Perhaps the first step forward is to pull xslt_process out of xml2 and 
create a standalone xslt contrib module. Then at least it can lose the 
stigma of being in a deprecated module and perhaps make it more visible 
to users.





2) Pavel's regression test exposes a bug in libxslt. The stylesheet declares
5 parameters, but uses 12. Simplifying, take the stylesheet:



I'm not sure whether there's anything we can do about this.


We should file a bug report with the libxslt authors, obviously.


Turns out the bug was filed in 2005 (see 
https://bugzilla.gnome.org/show_bug.cgi?id=307061). They are currently 
taking a fairly loose interpretation of the XSLT spec. However that was 
only one aspect of the concern. The other was that no errors were being 
reported back in psql when the libxslt is generating errors. Is this 
desirable?



Regards,
--
Mike Fowler
Registered Linux user: 379787

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial review of xslt with no limits patch

2010-08-08 Thread Mike Fowler

On 06/08/10 17:50, Pavel Stehule wrote:


attached updated patch with regression test







Bravely ignoring the quotation/varidic/favourite_scheme_here 
conversations, I've taken a look at the patch as is. Thanks to Tom's 
input I can now correctly drive the function. I can also report that 
code is now behaving in the expected way.


I have two other observations, more directed at the community than Pavel:

1) XML2 is largely undocumented, giving rise to the problems 
encountered. Since the module is deprecated anyways, does it make more 
sense to get xslt handling moved into core and get it fully documented?


2) Pavel's regression test exposes a bug in libxslt. The stylesheet 
declares 5 parameters, but uses 12. Simplifying, take the stylesheet:


xsl:stylesheet xmlns:xsl=http://www.w3.org/1999/XSL/Transform; 
version=1.0

   xsl:output method=xml omit-xml-declaration=yes indent=yes/
   xsl:strip-space elements=*/
   xsl:param name=n1/
   xsl:template match=*
 xsl:element name=samples
   xsl:element name=sample
 xsl:value-of select=$n1/
   /xsl:element
   xsl:element name=sample
 xsl:value-of select=$n2/
   /xsl:element
 /xsl:element
   /xsl:template
/xsl:stylesheet

and run the command:

~/temp$ xsltproc --stringparam n2 v2 Untitled2.xsl Untitled1.xml
samples
  sample/
  samplev2/sample
/samples

All looks good. However if you run:

~/temp$ xsltproc --stringparam n1 v1 Untitled2.xsl Untitled1.xml
runtime error: file Untitled2.xsl line 28 element value-of
Variable 'n2' has not been declared.
xmlXPathCompiledEval: evaluation failed
runtime error: file Untitled2.xsl line 28 element value-of
XPath evaluation returned no result.
samples
  samplev1/sample
  sample/
/samples

The xslt_process function ignores these errors and returns cleanly.

To summarize, the bug in libxslt is that it allows the processing of 
unnamed parameters - most other parsers would reject this stylesheet. 
Secondly, the xslt_process does not return the errors reported when 
running without passing the unnamed parameter.


Personally I would like to see this documented and moved to core so that 
the whole of xml2 can be dropped. I also think that the errors should be 
reported, even if libxslt doesn't behave properly in all scenarios.


Of course there's that whole other issue around how you pass the 
parameters in the first place that needs resolving too...


Regards,
--
Mike Fowler
Registered Linux user: 379787

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-08-07 Thread Mike Fowler

On 06/08/10 20:55, Peter Eisentraut wrote:

On fre, 2010-08-06 at 09:04 +0100, Mike Fowler wrote:

If the patch is to be committed, does it make sense for me to refine
it such that it uses the new xpath internal function you extracted in
the xmlexists patch?


Yes, you can probably shrink this patch down to about 20 lines.



Updated the patch so that it will apply to head and re-worked the 
function to use the new xpath internal function.


Regards,

--
Mike Fowler
Registered Linux user: 379787
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8693,8698  SELECT xpath('//mydefns:b/text()', 'a xmlns=http://example.com;btest/b/a
--- 8693,8731 
  (1 row)
  ]]/screen
 /para
+ 
+sect3
+ titlexpath_exists/title
+ 
+ indexterm
+  primaryxpath_exists/primary
+ /indexterm
+ 
+ synopsis
+  functionxpath_exists/function(replaceablexpath/replaceable, replaceablexml/replaceableoptional, replaceablensarray/replaceable/optional)
+ /synopsis
+ 
+ para
+  The function functionxpath_exists/function is a specialised form
+  of the functionxpath/function function. Though the functions are
+  syntactically the same the xpath expressions are evaluated in differing
+  contexts. Instead of returning the XML values that satisfy the xpath, this
+  function returns a boolean indicating whether the query was satisfied or not.
+ /para
+ 
+ para
+  Example:
+  screen![CDATA[
+ SELECT xpath_exists('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a', 
+  ARRAY[ARRAY['my', 'http://example.com']]);
+ 
+  xpath_exists  
+ 
+  t
+ (1 row)
+ ]]/screen
+  /para
+ /sect3
/sect2
  
sect2 id=functions-xml-mapping
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3541,3543  Datum xmlexists(PG_FUNCTION_ARGS)
--- 3541,3567 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean. Differs from
+  * xmlexists as it supports namespaces and is not defined in SQL/XML.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	int			res_nitems;
+ 
+ 	xpath_internal(xpath_expr_text, data, namespaces,
+    res_nitems, NULL);
+ 
+ 	PG_RETURN_BOOL(res_nitems  0);
+ #else
+ 	NO_XML_SUPPORT();
+ 	return 0;
+ #endif
+ }
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***
*** 4390,4395  DESCR(evaluate XPath expression);
--- 4390,4400 
  DATA(insert OID = 2614 (  xmlexists	 PGNSP PGUID 12 1 0 0 f f f t f i 2 0 16 25 142 _null_ _null_ _null_ _null_ xmlexists _null_ _null_ _null_ ));
  DESCR(test XML value against XPath expression);
  
+ DATA(insert OID = 3037 (  xpath_exists	 PGNSP PGUID 12 1 0 0 f f f t f i 3 0 16 25 142 1009 _null_ _null_ _null_ _null_ xpath_exists _null_ _null_ _null_ ));
+ DESCR(evaluate XPath expression in a boolean context, with namespaces support);
+ DATA(insert OID = 3038 (  xpath_exists	 PGNSP PGUID 14 1 0 0 f f f t f i 2 0 16 25 142 _null_ _null_ _null_ _null_ select pg_catalog.xpath_exists($1, $2, ''{}''::pg_catalog.text[]) _null_ _null_ _null_ ));
+ DESCR(evaluate XPath expression in a boolean context);
+ 
  /* uuid */
  DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 0 f f f t f i 1 0 2950 2275 _null_ _null_ _null_ _null_ uuid_in _null_ _null_ _null_ ));
  DESCR(I/O);
*** a/src/include/utils/xml.h
--- b/src/include/utils/xml.h
***
*** 37,42  extern Datum texttoxml(PG_FUNCTION_ARGS);
--- 37,43 
  extern Datum xmltotext(PG_FUNCTION_ARGS);
  extern Datum xmlvalidate(PG_FUNCTION_ARGS);
  extern Datum xpath(PG_FUNCTION_ARGS);
+ extern Datum xpath_exists(PG_FUNCTION_ARGS);
  extern Datum xmlexists(PG_FUNCTION_ARGS);
  
  extern Datum table_to_xml(PG_FUNCTION_ARGS);
*** a/src/test/regress/expected/xml.out
--- b/src/test/regress/expected/xml.out
***
*** 502,507  SELECT xpath('//b', 'aone btwo/b three betc/b/a');
--- 502,560 
   {btwo/b,betc/b}
  (1 row)
  
+ -- Test xpath_exists evaluation
+ SELECT xpath_exists('//town[text() = ''Toronto'']','townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);
+  xpath_exists 
+ --
+  f
+ (1 row)
+ 
+ SELECT xpath_exists('//town[text() = ''Cwmbran'']','townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);
+  xpath_exists 
+ --
+  t
+ (1 row)
+ 
+ INSERT INTO xmltest VALUES (4, 'menubeersnameBudvar/namecostfree/costnameCarling/namecostlots/cost/beers/menu'::xml);
+ INSERT INTO xmltest VALUES (5, 'menubeersnameMolson/namecostfree/costnameCarling/namecostlots/cost/beers/menu'::xml);
+ INSERT INTO xmltest VALUES (6, 'myns:menu xmlns:myns=http://myns.com;myns:beersmyns:nameBudvar

Re: [HACKERS] review: xml_is_well_formed

2010-08-07 Thread Mike Fowler

On 06/08/10 21:55, Peter Eisentraut wrote:

On fre, 2010-08-06 at 14:43 +0100, Mike Fowler wrote:

Or perhaps it could return a string instead of a boolean: content,
document, or NULL if it's neither.



I like the sound of that. In fact this helps workaround the IS
DOCUMENT
and IS CONTENT limitations such that you can you can select only
content, only documents or both is you use IS NOT NULL.

Unless anyone sees a reason that this function needs to remain a
boolean function, I'll rework the patch over the weekend.


What is the actual use case for this function?  Is the above behavior
actually useful?


The idea is to be able to filter a table that contains XML in TEXT that 
might not be well formed. Knowing that you're only dealing with well 
formed XML prevents you blowing up when you attempt the cast.




One reason to stick with boolean is backward compatibility.



To be honest I'm happiest with returning a boolean, even if there is 
some confusion over content only being valid. Though changing the return 
value to DOCUMENT/CONTENT/NULL makes things a touch more explicit, the 
same results can be achieved by simply running:


SELECT data::xml FROM mixed WHERE xml_is_well_formed(data) AND data::xml 
IS DOCUMENT;


Regards,
--
Mike Fowler
Registered Linux user: 379787

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-08-06 Thread Mike Fowler

On 06/08/10 05:38, Peter Eisentraut wrote:

On tis, 2010-07-27 at 16:33 -0700, David Fetter wrote:
   

* Do we already have it?

 Not really.  There are kludges to accomplish these things, but
 they're available mostly in the sense that a general-purpose
 language allows you to write code to do anything a Turing machine
 can do.
 

I think this has been obsoleted by the xmlexists patch


In many ways yes. The only surviving difference is that xpath_exists has 
support for namespaces and xmlexists does not as the grammar expects 
namespaces to be handled in the xquery. So if people expect namespace 
support to be useful that having both functions is useful until I (or 
someone who works faster than me) get xquery going.


If the patch is to be committed, does it make sense for me to refine it 
such that it uses the new xpath internal function you extracted in the 
xmlexists patch?


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: xml_is_well_formed

2010-08-06 Thread Mike Fowler

On 03/08/10 16:15, Peter Eisentraut wrote:

On lör, 2010-07-31 at 13:40 -0400, Robert Haas wrote:

Well-formedness should probably only allow XML documents.


I think the point of this function is to determine whether a cast to
xml will throw an error.  The behavior should probably match exactly
whatever test would be applied there.


Maybe there should be

xml_is_well_formed()
xml_is_well_formed_document()
xml_is_well_formed_content()

I agree that consistency with SQL/XML is desirable, but for someone
coming from the outside, the unqualified claim that 'foo' is well-formed
XML might sound suspicious.


What about making the function sensitive to the XML OPTION, such that:

test=# SET xmloption TO DOCUMENT;
SET
text=# SELECT xml_is_well_formed('foo');

  xml_is_well_formed
 
  f
 (1 row)

test=# SET xmloption TO CONTENT;
SET
text=# SELECT xml_is_well_formed('foo');

  xml_is_well_formed
 
  t
 (1 row)

with the inverse for DOCUMENTS? To me this makes the most sense as it makes the 
function behave much more like the other xml functions.

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: xml_is_well_formed

2010-08-06 Thread Mike Fowler

On 06/08/10 12:31, Robert Haas wrote:


Maybe there should be

xml_is_well_formed()
xml_is_well_formed_document()
xml_is_well_formed_content()

I agree that consistency with SQL/XML is desirable, but for someone
coming from the outside, the unqualified claim that 'foo' is well-formed
XML might sound suspicious.
   

What about making the function sensitive to the XML OPTION, such that:

test=# SET xmloption TO DOCUMENT;
SET
text=# SELECT xml_is_well_formed('foo');

  xml_is_well_formed
  
  f
  (1 row)
 

That will make using this function a huge hassle, won't it?  Functions
that do different things depending on GUC settings are usually
troublesome.  Having three functions would be more sensible if we need
all three behaviors, but I don't see why we do.

Or perhaps it could return a string instead of a boolean: content,
document, or NULL if it's neither.
   


I like the sound of that. In fact this helps workaround the IS DOCUMENT 
and IS CONTENT limitations such that you can you can select only 
content, only documents or both is you use IS NOT NULL.


Unless anyone sees a reason that this function needs to remain a boolean 
function, I'll rework the patch over the weekend.


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial review of xslt with no limits patch

2010-08-06 Thread Mike Fowler

On 06/08/10 15:08, Andrew Dunstan wrote:



On 08/06/2010 02:29 AM, Pavel Stehule wrote:

2010/8/6 David Fetterda...@fetter.org:

On Fri, Aug 06, 2010 at 05:57:37AM +0200, Pavel Stehule wrote:

2010/8/6 Andrew Dunstanand...@dunslane.net:

On 08/05/2010 06:56 PM, Mike Fowler wrote:

SELECT
xslt_process('employeenamecim/nameage30/agepay400/pay/employee'::text, 


$$xsl:stylesheet xmlns:xsl=http://www.w3.org/1999/XSL/Transform;
version=1.0
xsl:output method=xml omit-xml-declaration=yes indent=yes/


[snip]

/xsl:stylesheet$$::text, 'n1=v1,n2=v2,n3=v3,n4=v4,n5=v5'::text)
I haven't been paying attention to this, so sorry if this has been 
discussed
before, but it just caught my eye. Why are we passing these params 
as a

comma-separated list rather than as an array or as a variadic list of
params? This looks rather ugly. What if you want to have a param that
includes a comma?

There is probably problem in pairs - label = value. Can be nice, if we
can use a variadic functions for this, but I am afraid, ...

using a variadic function isn't too much nice now

some xslt_process(xmlsrc, 'n1=v1','n2=v2','n3=v3'

This sounds like the perfect case for pulling hstore into core code. :)

I afraid so integration of hstore can break and block work on real
hash support. I would to have hash tables in core, but with usual
features and usual syntax - like Perl or PHP



Can we just keep this discussion within reasonable bounds? The issue 
is not hstore or other hashes, but how to do the param list for xslt 
sanely given what we have now. A variadic list will be much nicer than 
what is currently proposed.


cheers

andrew


+1

Variadic seems the most sensible to me anyways.

However the more urgent problem is, pending someone spotting an error in 
my ways, neither the existing code or the patched code appear able to 
evaluate the parameters. Note than in my example I supplied a default 
value for the fifth parameter and not even that value is appearing in 
the outputted XML.


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial review of xslt with no limits patch

2010-08-05 Thread Mike Fowler

Hi Pavel,

On 02/08/10 09:21, Pavel Stehule wrote:

Hello

2010/8/2 Mike Fowlerm...@mlfowler.com:

Hi Pavel,

Currently your patch isn't applying to head, from the looks of things a
function signature has changed. Can you update your patch please?



yes - see attachment



Thanks, the new patch applies cleanly. However I've been attempting to 
run the parameterised XSLT this evening but to no avail. Reverting your 
code I've discovered that it does not work in the old version either.


Given the complete lack of documentation (not your fault) it's always 
possible that I'm doing something wrong. Given the query below, you 
should get the XML that follows, and indeed in oXygen (a standalone XML 
tool) you do:


SELECT 
xslt_process('employeenamecim/nameage30/agepay400/pay/employee'::text, 
$$xsl:stylesheet xmlns:xsl=http://www.w3.org/1999/XSL/Transform; 
version=1.0

   xsl:output method=xml omit-xml-declaration=yes indent=yes/
   xsl:strip-space elements=*/
   xsl:param name=n1/
   xsl:param name=n2/
   xsl:param name=n3/
   xsl:param name=n4/
   xsl:param name=n5 select='me'/
   xsl:template match=*
 xsl:element name=samples
   xsl:element name=sample
 xsl:value-of select=$n1/
   /xsl:element
   xsl:element name=sample
 xsl:value-of select=$n2/
   /xsl:element
   xsl:element name=sample
 xsl:value-of select=$n3/
   /xsl:element
   xsl:element name=sample
 xsl:value-of select=$n4/
   /xsl:element
   xsl:element name=sample
 xsl:value-of select=$n5/
   /xsl:element
 /xsl:element
   /xsl:template
/xsl:stylesheet$$::text, 'n1=v1,n2=v2,n3=v3,n4=v4,n5=v5'::text)

samples
  samplev1/sample
  samplev2/sample
  samplev3/sample
  samplev4/sample
  samplev5/sample
/samples

Sadly I get the following in both versions:

samples
  sample/
  sample/
  sample/
  sample/
  sample/
/samples


Unless you can see anything I'm doing wrong I suggest we mark this patch 
either 'Returned with feedback' or 'Rejected'. Since contrib/xml2 is 
deprecated, perhaps a better way forward is to pull XSLT handling into 
core and fix both the apparent inability to handle parameters as well as 
the limited number of parameters.


Regards,

--
Mike Fowler
Registered Linux user: 379787

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: xml_is_well_formed

2010-08-02 Thread Mike Fowler

On 02/08/10 07:46, Pavel Stehule wrote:



I have not any suggestions now - so I'll change flag to ready to commit


sorry - contrib module should be a fixed

patch attached



Thanks Pavel, you saved me some time!

Regards,

--
Mike Fowler
Registered Linux user: 379787

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Initial review of xslt with no limits patch

2010-08-01 Thread Mike Fowler

Hi Pavel,

Currently your patch isn't applying to head, from the looks of things a 
function signature has changed. Can you update your patch please?


Also, having had a read through the patch itself I note that there are 
no tests and no changes to documentation. Shouldn't the documentation 
advertise that the there are no limits on the numbers of parameters? A 
couple of tests will also help me to test your patch,


Below is the results of running patch:

patch -p0  ../nolimits.diff
patching file ./contrib/xml2/xslt_proc.c
Hunk #1 FAILED at 42.
Hunk #2 succeeded at 57 (offset -2 lines).
Hunk #3 succeeded at 69 (offset -2 lines).
Hunk #4 succeeded at 142 (offset -4 lines).
Hunk #5 succeeded at 179 (offset -4 lines).
Hunk #6 succeeded at 192 with fuzz 1 (offset -4 lines).
1 out of 6 hunks FAILED -- saving rejects to file 
./contrib/xml2/xslt_proc.c.rej


The rejects were:


*** ./contrib/xml2/xslt_proc.c.orig 2010-03-03 20:10:22.0 +0100
--- ./contrib/xml2/xslt_proc.c  2010-05-03 15:07:17.010918303 +0200
***
*** 42,50 
  extern void pgxml_parser_init(void);

  /* local defs */
! static void parse_params(const char **params, text *paramstr);

! #define MAXPARAMS 20  /* must be even, see 
parse_params() */


  #endif /* USE_LIBXSLT */

--- 42,51 
  extern void pgxml_parser_init(void);

  /* local defs */
! const char **parse_params(text *paramstr);

! #define INIT_PARAMS 20/* must be even, see 
parse_params() */
! #define EXTEND_PARAMS 20  /* must be even, see 
parse_params() */


  #endif /* USE_LIBXSLT */


Regards,
--
Mike Fowler
Registered Linux user: 379787

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: xml_is_well_formed

2010-07-30 Thread Mike Fowler

Hi Pavel,

Thanks for taking the time to review my patch. Attached is a new version 
addressing your concerns.


On 29/07/10 14:21, Pavel Stehule wrote:

I have a few issues:
* broken regress test (fedora 13 - xmllint: using libxml version 20707)

postgres=# SELECT xml_is_well_formed('pg:foo
xmlns:pg=http://postgresql.org/stuff;;bar/pg:foo');
  xml_is_well_formed

  f
(1 row)

this xml is broken - but in regress tests is ok

[pa...@pavel-stehule ~]$ xmllint xxx
xxx:1: parser error : error parsing attribute name
pg:foo xmlns:pg=http://postgresql.org/stuff;;bar/pg:foo
   


This is a problem that was observered recently by Thom Brown - the 
commit fest webapp adds the semicolon after the quote. If you look at 
the attachment I sent in a email client you'll find there is no 
semicolon. The patch attached here will also not have the semicolon.



* xml_is_well_formed returns true for simple text

postgres=# SELECT xml_is_well_formed('');
  xml_is_well_formed

  t
(1 row)

it is probably wrong result - is it ok??
   


Yes this is OK, pure text is valid XML content.


* I don't understand to this fragment

PG_TRY();
+   {
+   size_t  count;
+   xmlChar*version = NULL;
+   int standalone = -1;
+.
+   res_code = parse_xml_decl(string,count,version,
NULL,standalone);
+   if (res_code != 0)
+   xml_ereport_by_code(ERROR, ERRCODE_INVALID_XML_CONTENT,
+ invalid XML
content: invalid XML declaration,
+   res_code);
+.
+   doc = xmlNewDoc(version);
+   doc-encoding = xmlStrdup((const xmlChar *) UTF-8);
+   doc-standalone = 1;

why? This function can raise exception when declaration is wrong. It
is wrong - I think, this function should returns false instead
exception.

   


You're quite right,  I should be returning false here and not causing an 
exception. I have corrected this in the attached patch.


Regards,

--
Mike Fowler
Registered Linux user: 379787

*** a/contrib/xml2/xpath.c
--- b/contrib/xml2/xpath.c
***
*** 27,33  PG_MODULE_MAGIC;
  
  /* externally accessible functions */
  
- Datum		xml_is_well_formed(PG_FUNCTION_ARGS);
  Datum		xml_encode_special_chars(PG_FUNCTION_ARGS);
  Datum		xpath_nodeset(PG_FUNCTION_ARGS);
  Datum		xpath_string(PG_FUNCTION_ARGS);
--- 27,32 
***
*** 70,97  pgxml_parser_init(void)
  	xmlLoadExtDtdDefaultValue = 1;
  }
  
- 
- /* Returns true if document is well-formed */
- 
- PG_FUNCTION_INFO_V1(xml_is_well_formed);
- 
- Datum
- xml_is_well_formed(PG_FUNCTION_ARGS)
- {
- 	text	   *t = PG_GETARG_TEXT_P(0);		/* document buffer */
- 	int32		docsize = VARSIZE(t) - VARHDRSZ;
- 	xmlDocPtr	doctree;
- 
- 	pgxml_parser_init();
- 
- 	doctree = xmlParseMemory((char *) VARDATA(t), docsize);
- 	if (doctree == NULL)
- 		PG_RETURN_BOOL(false);	/* i.e. not well-formed */
- 	xmlFreeDoc(doctree);
- 	PG_RETURN_BOOL(true);
- }
- 
- 
  /* Encodes special characters (, , ,  and \r) as XML entities */
  
  PG_FUNCTION_INFO_V1(xml_encode_special_chars);
--- 69,74 
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8554,8562  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]/screen
  /para
 /sect3
  
 sect3
! titleXML Predicates/title
  
  indexterm
   primaryIS DOCUMENT/primary
--- 8554,8566 
  ]]/screen
  /para
 /sect3
+   /sect2
+ 
+   sect2
+titleXML Predicates/title
  
 sect3
! titleIS DOCUMENT/title
  
  indexterm
   primaryIS DOCUMENT/primary
***
*** 8574,8579  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
--- 8578,8675 
   between documents and content fragments.
  /para
 /sect3
+ 
+sect3
+ titlexml_is_well_formed/title
+ 
+ indexterm
+  primaryxml_is_well_formed/primary
+  secondarywell formed/secondary
+ /indexterm
+ 
+ synopsis
+ functionxml_is_well_formed/function(replaceabletext/replaceable)
+ /synopsis
+ 
+ para
+  The function functionxml_is_well_formed/function evaluates whether
+  the replaceabletext/replaceable is well formed XML content, returning
+  a boolean.
+ /para
+ para
+ Example:
+ screen![CDATA[
+ SELECT xml_is_well_formed('foobar/foo');
+  xml_is_well_formed
+ 
+  t
+ (1 row)
+ 
+ SELECT xml_is_well_formed('foobar/foo');
+  xml_is_well_formed
+ 
+  f
+ (1 row)
+ 
+ SELECT xml_is_well_formed('foobarstuff/foo');
+  xml_is_well_formed
+ 
+  f
+ (1 row)
+ ]]/screen
+ /para
+ para
+ In addition to the structure checks, the function ensures that namespaces are correcty matched.
+ screen![CDATA[
+ SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org/stuff

Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-07-24 Thread Mike Fowler

On 21/07/10 08:33, Mike Fowler wrote:


Why is the first argument AexprConst instead of a_expr? The SQL
standard says it's a character string literal, but I think we can very
well allow arbitrary expressions.


Yes, it was AexprConst because of the specification. I also found that
using it solved my shift/reduce problems, but I can change it a_expr as
see if I can work them out in a different way.


[snip]


Why c_expr?


As with the AexprConst, it's choice was partially influenced by the fact
it solved the shift/reduce errors I was getting. I'm guessing than that
I should really use a_expr and resolve the shift/reduce problem
differently?



Attached is the revised version of the patch addressing all the issues 
raised in the review, except for the use of AexprConst and c_expr. With 
my limited knowledge of bison I've failed to resolve the shift/reduce 
errors that are introduced by using a_expr. I'm open to suggestions as 
my desk is getting annoyed with me beating it in frustration!


Thanks again for taking the time to review my work.

Regards,

--
Mike Fowler
Registered Linux user: 379787
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8554,8562  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]/screen
  /para
 /sect3
  
!sect3
  titleXML Predicates/title
  
  indexterm
   primaryIS DOCUMENT/primary
--- 8554,8570 
  ]]/screen
  /para
 /sect3
+/sect2
  
!sect2
  titleXML Predicates/title
+ 
+ indexterm
+  primaryXML Predicates/primary
+ /indexterm
+  
+sect3
+ titleIS DOCUMENT/title
  
  indexterm
   primaryIS DOCUMENT/primary
***
*** 8574,8579  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
--- 8582,8619 
   between documents and content fragments.
  /para
 /sect3
+
+sect3
+ titleXMLEXISTS/title
+ 
+ indexterm
+  primaryXMLEXISTS/primary
+ /indexterm
+ 
+ synopsis
+ functionXMLEXISTS/function(replaceablexpath/replaceable optionalPASSING BY REF replaceablexml/replaceable optionalBY REF/optional/optional)
+ /synopsis
+ 
+ para
+  The function functionxmlexists/function returns true if the replaceablexpath/replaceable 
+  returns any nodes and false otherwise. If no replaceablexml/replaceable is passed, the function
+  will return false as a XPath cannot be evaluated without content. See the
+  ulink url=http://www.w3.org/TR/xpath/#section-Introduction;W3C recommendation 'XML Path Language'/ulink
+  for a detailed explanation of why.
+ /para
+ 
+ para
+  Example:
+  screen![CDATA[
+ SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF 'townstownToronto/towntownOttawa/town/towns');
+ 
+  xmlexists
+ 
+  t
+ (1 row)
+ ]]/screen
+ /para
+/sect3
/sect2
  
sect2 id=functions-xml-processing
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***
*** 423,431  static TypeName *TableFuncTypeName(List *columns);
  %type list	opt_check_option
  
  %type target	xml_attribute_el
! %type list	xml_attribute_list xml_attributes
  %type node	xml_root_version opt_xml_root_standalone
! %type ival	document_or_content
  %type boolean xml_whitespace_option
  
  %type node 	common_table_expr
--- 423,432 
  %type list	opt_check_option
  
  %type target	xml_attribute_el
! %type list	xml_attribute_list xml_attributes xmlexists_list
  %type node	xml_root_version opt_xml_root_standalone
! %type node	xmlexists_query_argument_list
! %type ival	document_or_content 
  %type boolean xml_whitespace_option
  
  %type node 	common_table_expr
***
*** 511,523  static TypeName *TableFuncTypeName(List *columns);
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
--- 512,524 
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
***
*** 539,545  static TypeName *TableFuncTypeName(List *columns);
  
  	WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE

Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-07-21 Thread Mike Fowler

Hi Peter,

Thanks for your feedback.

On 20/07/10 19:54, Peter Eisentraut wrote:

Attached is a patch with the revised XMLEXISTS function, complete with
grammar support and regression tests. The implemented grammar is:

XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] )

Though the full grammar makes everything after the xpath_expression
optional, I've left it has mandatory simply to avoid lots of rework of
the function (would need new null checks, memory handling would need
reworking).

Some thoughts, mostly nitpicks:

The snippet of documentation could be clearer.  It says if the xml
satisifies the xpath.  Not sure what that means exactly.  An XPath
expression, by definition, returns a value.  How is that value used to
determine the result?
   


I'll rephrase it: The function xmlexists returns true if the xpath 
returns any nodes and false otherwise.



Naming of parser symbols: xmlexists_list isn't actually a list of
xmlexists's.  That particular rule can probably be done away with anyway
and the code be put directly into the XMLEXISTS rule.

Why is the first argument AexprConst instead of a_expr?  The SQL
standard says it's a character string literal, but I think we can very
well allow arbitrary expressions.
   


Yes, it was AexprConst because of the specification. I also found that 
using it solved my shift/reduce problems, but I can change it a_expr as 
see if I can work them out in a different way.



xmlexists_query_argument_list should be optional.
   


OK, I'll change it.


The rules xml_default_passing_mechanism and xml_passing_mechanism are
pretty useless to have a separate rules.  Just mention the tokens where
they are used.
   


Again, I'll change that too.


Why c_expr?
   


As with the AexprConst, it's choice was partially influenced by the fact 
it solved the shift/reduce errors I was getting. I'm guessing than that 
I should really use a_expr and resolve the shift/reduce problem differently?



Call the C-level function xmlexists for consistency.
   


Sure. I'll look to get a patch addressing these concerns out in the next 
day or two, work/family/sleep permitting! :)


Regards,

--
Mike Fowler
Registered Linux user: 379787



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCH] Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function

2010-07-12 Thread Mike Fowler

Thom Brown wrote:

Would a test for mismatched or undefined namespaces be necessary?

For example:

Mismatched namespace:
pg:foo xmlns:pg=http://postgresql.org/stuff;bar/my:foo

Undefined namespace when used in conjunction with IS DOCUMENT:
pg:foo xmlns:my=http://postgresql.org/stuff;bar/pg:foo
  


Thanks for looking at my patch Thom. I hadn't thought of that particular 
scenario and even though I didn't specifically code for it, the 
underlying libxml call does correctly reject the mismatched namespace:


template1=# SELECT xml_is_well_formed('pg:foo 
xmlns:pg=http://postgresql.org/stuff;bar/my:foo');
xml_is_well_formed

f
(1 row)



In the attached patch I've added the example to the SGML documentation 
and the regression tests.



Also, having a look at the following example from the patch:
SELECT xml_is_well_formed('local:data
xmlns:local=http://127.0.0.1;;local:piece id=1number
one/local:piecelocal:piece id=2 //local:data');
 xml_is_well_formed

 t
(1 row)

Just wondering about that semi-colon after the namespace definition.

Thom
  


The semi-colon is not supposed to be there, and I'm not sure where it's 
come from. With Thunderbird I see the email with my patch as an 
attachement, downloaded and viewing the file there are no instances of a 
 followed by a ;. However, if I look at the message on the archive at 
http://archives.postgresql.org/message-id/4c3871c2.8000...@mlfowler.com 
I can see every URL that ends with a  has  a ; following it. Should I 
be escaping the  in the patch file in some way or this just an artifact 
of HTML parsing a patch?


Regards,

--
Mike Fowler
Registered Linux user: 379787

*** a/contrib/xml2/xpath.c
--- b/contrib/xml2/xpath.c
***
*** 27,33  PG_MODULE_MAGIC;
  
  /* externally accessible functions */
  
- Datum		xml_is_well_formed(PG_FUNCTION_ARGS);
  Datum		xml_encode_special_chars(PG_FUNCTION_ARGS);
  Datum		xpath_nodeset(PG_FUNCTION_ARGS);
  Datum		xpath_string(PG_FUNCTION_ARGS);
--- 27,32 
***
*** 70,97  pgxml_parser_init(void)
  	xmlLoadExtDtdDefaultValue = 1;
  }
  
- 
- /* Returns true if document is well-formed */
- 
- PG_FUNCTION_INFO_V1(xml_is_well_formed);
- 
- Datum
- xml_is_well_formed(PG_FUNCTION_ARGS)
- {
- 	text	   *t = PG_GETARG_TEXT_P(0);		/* document buffer */
- 	int32		docsize = VARSIZE(t) - VARHDRSZ;
- 	xmlDocPtr	doctree;
- 
- 	pgxml_parser_init();
- 
- 	doctree = xmlParseMemory((char *) VARDATA(t), docsize);
- 	if (doctree == NULL)
- 		PG_RETURN_BOOL(false);	/* i.e. not well-formed */
- 	xmlFreeDoc(doctree);
- 	PG_RETURN_BOOL(true);
- }
- 
- 
  /* Encodes special characters (, , ,  and \r) as XML entities */
  
  PG_FUNCTION_INFO_V1(xml_encode_special_chars);
--- 69,74 
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8554,8562  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]/screen
  /para
 /sect3
  
 sect3
! titleXML Predicates/title
  
  indexterm
   primaryIS DOCUMENT/primary
--- 8554,8566 
  ]]/screen
  /para
 /sect3
+   /sect2
+ 
+   sect2
+titleXML Predicates/title
  
 sect3
! titleIS DOCUMENT/title
  
  indexterm
   primaryIS DOCUMENT/primary
***
*** 8574,8579  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
--- 8578,8675 
   between documents and content fragments.
  /para
 /sect3
+ 
+sect3
+ titlexml_is_well_formed/title
+ 
+ indexterm
+  primaryxml_is_well_formed/primary
+  secondarywell formed/secondary
+ /indexterm
+ 
+ synopsis
+ functionxml_is_well_formed/function(replaceabletext/replaceable)
+ /synopsis
+ 
+ para
+  The function functionxml_is_well_formed/function evaluates whether
+  the replaceabletext/replaceable is well formed XML content, returning
+  a boolean.
+ /para
+ para
+ Example:
+ screen![CDATA[
+ SELECT xml_is_well_formed('foobar/foo');
+  xml_is_well_formed
+ 
+  t
+ (1 row)
+ 
+ SELECT xml_is_well_formed('foobar/foo');
+  xml_is_well_formed
+ 
+  f
+ (1 row)
+ 
+ SELECT xml_is_well_formed('foobarstuff/foo');
+  xml_is_well_formed
+ 
+  f
+ (1 row)
+ ]]/screen
+ /para
+ para
+ In addition to the structure checks, the function ensures that namespaces are correcty matched.
+ screen![CDATA[
+ SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org/stuff;bar/my:foo');
+  xml_is_well_formed
+ 
+  f
+ (1 row)
+ 
+ SELECT xml_is_well_formed('pg:foo xmlns:pg=http://postgresql.org/stuff;bar/pg:foo');
+  xml_is_well_formed
+ 
+  t
+ (1 row)
+ ]]/screen
+ /para
+ para
+ This function can be combined with the IS DOCUMENT predicate to prevent
+ invalid XML content errors from occuring in queries. For example, given a
+ table that may have rows with invalid XML mixed in with rows of valid

Re: [PATCH] Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function

2010-07-10 Thread Mike Fowler

Robert Haas wrote:

On Fri, Jul 9, 2010 at 4:06 PM, Peter Eisentraut pete...@gmx.net wrote:
  

On ons, 2010-07-07 at 16:37 +0100, Mike Fowler wrote:


Here's the patch to add the 'xml_is_well_formed' function.
  

I suppose we should remove the function from contrib/xml2 at the same
time.



Yep


Revised patch deleting the contrib/xml2 version of the function attached.

Regards,

--
Mike Fowler
Registered Linux user: 379787

*** a/contrib/xml2/xpath.c
--- b/contrib/xml2/xpath.c
***
*** 27,33  PG_MODULE_MAGIC;
  
  /* externally accessible functions */
  
- Datum		xml_is_well_formed(PG_FUNCTION_ARGS);
  Datum		xml_encode_special_chars(PG_FUNCTION_ARGS);
  Datum		xpath_nodeset(PG_FUNCTION_ARGS);
  Datum		xpath_string(PG_FUNCTION_ARGS);
--- 27,32 
***
*** 70,97  pgxml_parser_init(void)
  	xmlLoadExtDtdDefaultValue = 1;
  }
  
- 
- /* Returns true if document is well-formed */
- 
- PG_FUNCTION_INFO_V1(xml_is_well_formed);
- 
- Datum
- xml_is_well_formed(PG_FUNCTION_ARGS)
- {
- 	text	   *t = PG_GETARG_TEXT_P(0);		/* document buffer */
- 	int32		docsize = VARSIZE(t) - VARHDRSZ;
- 	xmlDocPtr	doctree;
- 
- 	pgxml_parser_init();
- 
- 	doctree = xmlParseMemory((char *) VARDATA(t), docsize);
- 	if (doctree == NULL)
- 		PG_RETURN_BOOL(false);	/* i.e. not well-formed */
- 	xmlFreeDoc(doctree);
- 	PG_RETURN_BOOL(true);
- }
- 
- 
  /* Encodes special characters (, , ,  and \r) as XML entities */
  
  PG_FUNCTION_INFO_V1(xml_encode_special_chars);
--- 69,74 
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8554,8562  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]/screen
  /para
 /sect3
  
 sect3
! titleXML Predicates/title
  
  indexterm
   primaryIS DOCUMENT/primary
--- 8554,8566 
  ]]/screen
  /para
 /sect3
+   /sect2
+ 
+   sect2
+titleXML Predicates/title
  
 sect3
! titleIS DOCUMENT/title
  
  indexterm
   primaryIS DOCUMENT/primary
***
*** 8574,8579  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
--- 8578,8653 
   between documents and content fragments.
  /para
 /sect3
+ 
+sect3
+ titlexml_is_well_formed/title
+ 
+ indexterm
+  primaryxml_is_well_formed/primary
+  secondarywell formed/secondary
+ /indexterm
+ 
+ synopsis
+ functionxml_is_well_formed/function(replaceabletext/replaceable)
+ /synopsis
+ 
+ para
+  The function functionxml_is_well_formed/function evaluates whether
+  the replaceabletext/replaceable is well formed XML content, returning
+  a boolean.
+ /para
+ para
+ Example:
+ screen![CDATA[
+ SELECT xml_is_well_formed('foobar/foo');
+  xml_is_well_formed
+ 
+  t
+ (1 row)
+ 
+ SELECT xml_is_well_formed('foobar/foo');
+  xml_is_well_formed
+ 
+  f
+ (1 row)
+ ]]/screen
+ /para
+ para
+ This function can be combined with the IS DOCUMENT predicate to prevent
+ invalid XML content errors from occuring in queries. For example, given a
+ table that may have rows with invalid XML mixed in with rows of valid
+ XML, functionxml_is_well_formed/function can be used to filter out all
+ the invalid rows.
+ /para
+ para
+ Example:
+ screen![CDATA[
+ SELECT * FROM mixed;
+  data
+ --
+  foobar/foo
+  foobar/foo
+  foobar/foobarfoo/bar
+  foobar/foobarfoo/bar
+ (4 rows)
+ 
+ SELECT COUNT(data) FROM mixed WHERE data::xml IS DOCUMENT;
+ ERROR:  invalid XML content
+ DETAIL:  Entity: line 1: parser error : expected ''
+ foobar/foo
+  ^
+ Entity: line 1: parser error : chunk is not well balanced
+ foobar/foo
+  ^
+ 
+ SELECT COUNT(data) FROM mixed WHERE xml_is_well_formed(data) AND data::xml IS DOCUMENT;
+  count
+ ---
+  1
+ (1 row)
+ ]]/screen
+ /para
+/sect3
/sect2
  
sect2 id=functions-xml-processing
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3293,3298  xml_xmlnodetoxmltype(xmlNodePtr cur)
--- 3293,3365 
  }
  #endif
  
+ Datum
+ xml_is_well_formed(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text*data = PG_GETARG_TEXT_P(0);
+ 	boolresult;
+ 	int	res_code;
+ 	int32len;
+ 	const xmlChar		*string;
+ 	xmlParserCtxtPtr	ctxt;
+ 	xmlDocPtr			doc = NULL;
+ 
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	string = xml_text2xmlChar(data);
+ 
+ 	/* Start up libxml and its parser (no-ops if already done) */
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	ctxt = xmlNewParserCtxt();
+ 	if (ctxt == NULL)
+ 		xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 	could not allocate parser context);
+ 
+ 	PG_TRY();
+ 	{
+ 		size_t		count;
+ 		xmlChar*version = NULL;
+ 		int			standalone = -1;
+ 
+ 		res_code = parse_xml_decl(string, count, version, NULL, standalone);
+ 		if (res_code != 0)
+ 			xml_ereport_by_code(ERROR

[PATCH] Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function

2010-07-07 Thread Mike Fowler

Peter Eisentraut wrote:

On lör, 2010-07-03 at 09:26 +0100, Mike Fowler wrote:
  
What I will do  
instead is implement the xml_is_well_formed function and get a patch  
out in the next day or two. 



That sounds very useful.
  
Here's the patch to add the 'xml_is_well_formed' function. Paraphrasing 
the SGML the syntax is:


|xml_is_well_formed|(/text/)

The function |xml_is_well_formed| evaluates whether the /text/ is well 
formed XML content, returning a boolean. I've done some tests (included 
in the patch) with tables containing a mixture of well formed documents 
and content and the function is happily returning the expected result. 
Combining with IS (NOT) DOCUMENT is working nicely for pulling out 
content or documents from a table of text.


Unless I missed something in the original correspondence, I think this 
patch will solve the issue.


Regards,

--
Mike Fowler
Registered Linux user: 379787

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8554,8562  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]/screen
  /para
 /sect3
  
 sect3
! titleXML Predicates/title
  
  indexterm
   primaryIS DOCUMENT/primary
--- 8554,8566 
  ]]/screen
  /para
 /sect3
+   /sect2
+ 
+   sect2
+titleXML Predicates/title
  
 sect3
! titleIS DOCUMENT/title
  
  indexterm
   primaryIS DOCUMENT/primary
***
*** 8574,8579  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
--- 8578,8653 
   between documents and content fragments.
  /para
 /sect3
+ 
+sect3
+ titlexml_is_well_formed/title
+ 
+ indexterm
+  primaryxml_is_well_formed/primary
+  secondarywell formed/secondary
+ /indexterm
+ 
+ synopsis
+ functionxml_is_well_formed/function(replaceabletext/replaceable)
+ /synopsis
+ 
+ para
+  The function functionxml_is_well_formed/function evaluates whether
+  the replaceabletext/replaceable is well formed XML content, returning
+  a boolean.
+ /para
+ para
+ Example:
+ screen![CDATA[
+ SELECT xml_is_well_formed('foobar/foo');
+  xml_is_well_formed
+ 
+  t
+ (1 row)
+ 
+ SELECT xml_is_well_formed('foobar/foo');
+  xml_is_well_formed
+ 
+  f
+ (1 row)
+ ]]/screen
+ /para
+ para
+ This function can be combined with the IS DOCUMENT predicate to prevent
+ invalid XML content errors from occuring in queries. For example, given a
+ table that may have rows with invalid XML mixed in with rows of valid
+ XML, functionxml_is_well_formed/function can be used to filter out all
+ the invalid rows.
+ /para
+ para
+ Example:
+ screen![CDATA[
+ SELECT * FROM mixed;
+  data
+ --
+  foobar/foo
+  foobar/foo
+  foobar/foobarfoo/bar
+  foobar/foobarfoo/bar
+ (4 rows)
+ 
+ SELECT COUNT(data) FROM mixed WHERE data::xml IS DOCUMENT;
+ ERROR:  invalid XML content
+ DETAIL:  Entity: line 1: parser error : expected ''
+ foobar/foo
+  ^
+ Entity: line 1: parser error : chunk is not well balanced
+ foobar/foo
+  ^
+ 
+ SELECT COUNT(data) FROM mixed WHERE xml_is_well_formed(data) AND data::xml IS DOCUMENT;
+  count
+ ---
+  1
+ (1 row)
+ ]]/screen
+ /para
+/sect3
/sect2
  
sect2 id=functions-xml-processing
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3293,3298  xml_xmlnodetoxmltype(xmlNodePtr cur)
--- 3293,3365 
  }
  #endif
  
+ Datum
+ xml_is_well_formed(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text*data = PG_GETARG_TEXT_P(0);
+ 	boolresult;
+ 	int	res_code;
+ 	int32len;
+ 	const xmlChar		*string;
+ 	xmlParserCtxtPtr	ctxt;
+ 	xmlDocPtr			doc = NULL;
+ 
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	string = xml_text2xmlChar(data);
+ 
+ 	/* Start up libxml and its parser (no-ops if already done) */
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	ctxt = xmlNewParserCtxt();
+ 	if (ctxt == NULL)
+ 		xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 	could not allocate parser context);
+ 
+ 	PG_TRY();
+ 	{
+ 		size_t		count;
+ 		xmlChar*version = NULL;
+ 		int			standalone = -1;
+ 
+ 		res_code = parse_xml_decl(string, count, version, NULL, standalone);
+ 		if (res_code != 0)
+ 			xml_ereport_by_code(ERROR, ERRCODE_INVALID_XML_CONTENT,
+ 		  invalid XML content: invalid XML declaration,
+ 			res_code);
+ 
+ 		doc = xmlNewDoc(version);
+ 		doc-encoding = xmlStrdup((const xmlChar *) UTF-8);
+ 		doc-standalone = 1;
+ 
+ 		res_code = xmlParseBalancedChunkMemory(doc, NULL, NULL, 0, string + count, NULL);
+ 
+ 		result = !res_code;
+ 	}
+ 	PG_CATCH();
+ 	{
+ 		if (doc)
+ 			xmlFreeDoc(doc);
+ 		if (ctxt)
+ 			xmlFreeParserCtxt(ctxt);
+ 
+ 		PG_RE_THROW();
+ 	}
+ 	PG_END_TRY();
+ 
+ 	if (doc)
+ 		xmlFreeDoc(doc);
+ 	if (ctxt)
+ 		xmlFreeParserCtxt(ctxt);
+ 
+ 	return result;
+ #else
+ 	NO_XML_SUPPORT();
+ 	return 0

Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function

2010-07-03 Thread Mike Fowler

Quoting Peter Eisentraut pete...@gmx.net:


On fre, 2010-07-02 at 14:07 +0100, Mike Fowler wrote:

So if IS CONTENT were
to be implemented, to determine that you have something that is
malformed


But that's not what IS CONTENT does.  Content still needs to be
well-formed.



What I was hoping to achieve was to determine that something wasn't a  
document and wasn't content, however as you pointed out on the bugs  
thread the value must be XML. My mistake was not checking that I had  
followed the definitions all the way back to the root. What I will do  
instead is implement the xml_is_well_formed function and get a patch  
out in the next day or two.


Thank you Robert and Peter for tolerating my stumbles through the standard.

Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function

2010-07-02 Thread Mike Fowler

Quoting Robert Haas robertmh...@gmail.com:



I think the point if IS DOCUMENT is to distinguish a document:

foosome stuffbar/baz//foo

from a document fragment:

bar/baz/

A document is allowed only one toplevel tag.

It'd be nice, I think, to have a function that tells you whether
something is legal XML without throwing an error if it isn't, but I
suspect that should be a separate function, rather than trying to jam
it into IS DOCUMENT.

http://developer.postgresql.org/pgdocs/postgres/functions-xml.html#AEN15187



I've submitted a patch to the bug report I filed yesterday that  
implements this. The way I read the standard (and I'm only reading a  
draft and I'm no expert) I don't see that it mandates that IS DOCUMENT  
returns false when IS CONTENT would return true. So if IS CONTENT were  
to be implemented, to determine that you have something that is  
malformed you could say:


val IS NOT DOCUMENT AND val IS NOT CONTENT

I think having the direct predicate support would be useful for  
columns of text where you know that some, though possibly not all,  
text values are valid XML.


--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function

2010-07-01 Thread Mike Fowler

Quoting Mike Fowler m...@mlfowler.com:


Should the IS DOCUMENT predicate support this? At the moment you get
the following:

template1=# SELECT
'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'   
IS

DOCUMENT;
?column?
--
t
(1 row)

template1=# SELECT
'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'   
IS

DOCUMENT;
ERROR:  invalid XML content
LINE 1: SELECT 'townstownBidford-on-Avon/towntownCwmbran/to...
  ^
DETAIL:  Entity: line 1: parser error : expected ''
ownstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns

  ^
Entity: line 1: parser error : chunk is not well balanced
ownstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns

  ^
I would've hoped the second would've returned 'f' rather than failing.
I've had a glance at the XML/SQL standard and I don't see anything in
the detail of the predicate (8.2) that would specifically prohibit us
from changing this behavior, unless the common rule  'Parsing a string
as an XML value' (10.16) must always be in force. I'm no standard
expert, but IMHO this would be an acceptable change to improve
usability. What do others think?


Right, I've answered my own question whilst sitting in the open source  
coding session at CHAR(10). Yes, IS DOCUMENT should return false for a  
non-well formed document, and indeed is coded to do such. However, the  
conversion to the xml type which happens before the underlying  
xml_is_document function is even called fails and exceptions out. I'll  
work on a patch to resolve this behavior such that IS DOCUMENT will  
give you the missing 'xml_is_well_formed' function.


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-06-29 Thread Mike Fowler

Mike Fowler wrote:

Bruce Momjian wrote:

I have added this to the next commit-fest:

https://commitfest.postgresql.org/action/commitfest_view?id=6   
Thanks Bruce. Attached is a revised patch which changes the code 
slightly such that it uses an older version of the libxml library. 
I've added comments to the code so that we remember why we didn't use 
the latest function.


After seeing some other posts in the last couple of days, I realised I 
hadn't documented the function in the SGML. I have now done so, and 
added a couple of tests with XML literals. Please find the patch 
attached. Now time to go correct the xmlexists patch too...


--
Mike Fowler
Registered Linux user: 379787

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8626,8631  SELECT xpath('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a',
--- 8626,8664 
  (1 row)
  ]]/screen
 /para
+
+sect3
+ titlexpath_exists/title
+ 
+ indexterm
+  primaryxpath_exists/primary
+ /indexterm
+ 
+ synopsis
+  functionxpath_exists/function(replaceablexpath/replaceable, replaceablexml/replaceableoptional, replaceablensarray/replaceable/optional)
+ /synopsis
+ 
+ para
+  The function functionxpath_exists/function is a specialised form
+  of the functionxpath/function function. Though the functions are
+  syntactically the same the xpath expressions are evaluated in differing
+  contexts. Instead of returning the XML values that satisfy the xpath, this
+  function returns a boolean indicating whether the query was satisfied or not.
+ /para
+ 
+ para
+  Example:
+  screen![CDATA[
+ SELECT xpath_exists('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a', 
+ ARRAY[ARRAY['my', 'http://example.com']]);
+ 
+  xpath_exists  
+ 
+  t
+ (1 row)
+ ]]/screen
+ /para
+/sect3
/sect2
  
sect2 id=functions-xml-mapping
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3495,3497  xpath(PG_FUNCTION_ARGS)
--- 3495,3681 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	ArrayBuildState *astate = NULL;
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	xmlXPathObjectPtr xpathobj = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar*string;
+ 	xmlChar*xpath_expr;
+ 	int			i;
+ 	int			res_nitems;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg(invalid array for XML namespace mapping),
+ 	 errdetail(The array must be two-dimensional with length of the second axis equal to 2.)));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  ns_names_uris, ns_names_uris_nulls,
+ 		  ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg(empty XPath expression)));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two

Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-06-29 Thread Mike Fowler
Mike Fowler wrote:  
Thanks again for your help Robert, turns out the fault was in the 
pg_proc entry (the 3 up there should've been a two!). Once I took the 
grammar out it was quickly obvious where I'd gone wrong.


Attached is a patch with the revised XMLEXISTS function, complete with 
grammar support and regression tests. The implemented grammar is:


XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] )

Though the full grammar makes everything after the xpath_expression 
optional, I've left it has mandatory simply to avoid lots of rework of 
the function (would need new null checks, memory handling would need 
reworking).




As with the xpath_exists patch I've now added the SGML documentation 
detailing this function and extended the regression test a little to 
test XML literals.


Regards,

--
Mike Fowler
Registered Linux user: 379787

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8554,8562  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]/screen
  /para
 /sect3
  
!sect3
  titleXML Predicates/title
  
  indexterm
   primaryIS DOCUMENT/primary
--- 8554,8570 
  ]]/screen
  /para
 /sect3
+/sect2
  
!sect2
  titleXML Predicates/title
+ 
+ indexterm
+  primaryXML Predicates/primary
+ /indexterm
+  
+sect3
+ titleIS DOCUMENT/title
  
  indexterm
   primaryIS DOCUMENT/primary
***
*** 8574,8579  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
--- 8582,8616 
   between documents and content fragments.
  /para
 /sect3
+
+sect3
+ titleXMLEXISTS/title
+ 
+ indexterm
+  primaryXMLEXISTS/primary
+ /indexterm
+ 
+ synopsis
+ functionXMLEXISTS/function(replaceablexpath/replaceable PASSING BY REF replaceablexml/replaceable optionalBY REF/optional)
+ /synopsis
+ 
+ para
+  The function functionxmlexists/function returns true if the replaceablexml/replaceable 
+  satisfies the replaceablexpath/replaceable and false otherwise.
+ /para
+ 
+ para
+  Example:
+  screen![CDATA[
+ SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF 'townstownToronto/towntownOttawa/town/towns');
+ 
+  xmlexists
+ 
+  t
+ (1 row)
+ ]]/screen
+ /para
+/sect3
/sect2
  
sect2 id=functions-xml-processing
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***
*** 423,431  static TypeName *TableFuncTypeName(List *columns);
  %type list	opt_check_option
  
  %type target	xml_attribute_el
! %type list	xml_attribute_list xml_attributes
  %type node	xml_root_version opt_xml_root_standalone
! %type ival	document_or_content
  %type boolean xml_whitespace_option
  
  %type node 	common_table_expr
--- 423,432 
  %type list	opt_check_option
  
  %type target	xml_attribute_el
! %type list	xml_attribute_list xml_attributes xmlexists_list
  %type node	xml_root_version opt_xml_root_standalone
! %type node	xmlexists_query_argument_list xml_default_passing_mechanism xml_passing_mechanism
! %type ival	document_or_content 
  %type boolean xml_whitespace_option
  
  %type node 	common_table_expr
***
*** 511,523  static TypeName *TableFuncTypeName(List *columns);
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
--- 512,524 
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
***
*** 539,545  static TypeName *TableFuncTypeName(List *columns);
  
  	WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE
  
! 	XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLFOREST XMLPARSE
  	XMLPI XMLROOT XMLSERIALIZE
  
  	YEAR_P YES_P
--- 540,546 
  
  	WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE
  
! 	XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLPARSE
  	XMLPI XMLROOT XMLSERIALIZE
  
  	YEAR_P YES_P
***
*** 9806,9811  func_expr:	func_name '(' ')' over_clause
--- 9807,9828 
  {
  	$$ = makeXmlExpr(IS_XMLELEMENT, $4, $6, $8

Re: [HACKERS] Issue: Deprecation of the XML2 module 'xml_is_well_formed' function

2010-06-28 Thread Mike Fowler

Robert Haas wrote:

On Mon, Jun 28, 2010 at 11:42 AM, Mike Rylander mrylan...@gmail.com wrote:
  

You could do something like this (untested):

CREATE OR REPLACE FUNCTION my_xml_is_valid ( x TEXT ) RETURNS BOOL AS $$
BEGIN
 PERFORM XMLPARSE( DOCUMENT x::XML );
 RETURN TRUE;
EXCEPTION WHEN OTHERS THEN
 RETURN FALSE;
END;
$$ LANGUAGE PLPGSQL;



This might perform significantly worse, though: exception handling ain't cheap.

It's not a bad workaround, but I think the OP has a point.

  
Should the IS DOCUMENT predicate support this? At the moment you get the 
following:


template1=# SELECT 
'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns' 
IS DOCUMENT;

?column?
--
t
(1 row)

template1=# SELECT 
'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns' 
IS DOCUMENT;

ERROR:  invalid XML content
LINE 1: SELECT 'townstownBidford-on-Avon/towntownCwmbran/to...
  ^
DETAIL:  Entity: line 1: parser error : expected ''
ownstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns
  
^

Entity: line 1: parser error : chunk is not well balanced
ownstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns
  
^
I would've hoped the second would've returned 'f' rather than failing. 
I've had a glance at the XML/SQL standard and I don't see anything in 
the detail of the predicate (8.2) that would specifically prohibit us 
from changing this behavior, unless the common rule  'Parsing a string 
as an XML value' (10.16) must always be in force. I'm no standard 
expert, but IMHO this would be an acceptable change to improve 
usability. What do others think?


Regards,

--
Mike Fowler
Registered Linux user: 379787

I could be a genius if I just put my mind to it, and I,
I could do anything, if only I could get 'round to it
-PULP 'Glory Days'


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-06-27 Thread Mike Fowler



and finally in pg_proc.h I have:

DATA(insert OID = 3037 (  xmlexists PGNSP PGUID 12 1 0 0 f f f t f i 3 0
16 25 142 _null_ _null_ _null_ _null_ xml_exists _null_ _null_ _null_ ));
DESCR(evaluate XPath expression in a boolean context);


It looks like the pg_proc entry is creating an SQL function called
xmlexists referencing a C function called xml_exists, and the gram.y
changes want there to be an SQL function called xml_exists.  I think
you should rip out all the catalog and parser changes for starters,
and just try to get it working as a regular old function.  Once you
have that working, you can add the syntax support back in.  I'd
suggest making the C and SQL function names the same as each other,
but different from the keyword you're planning to use (xmlexists).
  
Thanks again for your help Robert, turns out the fault was in the 
pg_proc entry (the 3 up there should've been a two!). Once I took the 
grammar out it was quickly obvious where I'd gone wrong.


Attached is a patch with the revised XMLEXISTS function, complete with 
grammar support and regression tests. The implemented grammar is:


XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] )

Though the full grammar makes everything after the xpath_expression 
optional, I've left it has mandatory simply to avoid lots of rework of 
the function (would need new null checks, memory handling would need 
reworking).


--
Mike Fowler
Registered Linux user: 379787

*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***
*** 423,431  static TypeName *TableFuncTypeName(List *columns);
  %type list	opt_check_option
  
  %type target	xml_attribute_el
! %type list	xml_attribute_list xml_attributes
  %type node	xml_root_version opt_xml_root_standalone
! %type ival	document_or_content
  %type boolean xml_whitespace_option
  
  %type node 	common_table_expr
--- 423,432 
  %type list	opt_check_option
  
  %type target	xml_attribute_el
! %type list	xml_attribute_list xml_attributes xmlexists_list
  %type node	xml_root_version opt_xml_root_standalone
! %type node	xmlexists_query_argument_list xml_default_passing_mechanism xml_passing_mechanism
! %type ival	document_or_content 
  %type boolean xml_whitespace_option
  
  %type node 	common_table_expr
***
*** 511,523  static TypeName *TableFuncTypeName(List *columns);
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
--- 512,524 
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
***
*** 539,545  static TypeName *TableFuncTypeName(List *columns);
  
  	WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE
  
! 	XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLFOREST XMLPARSE
  	XMLPI XMLROOT XMLSERIALIZE
  
  	YEAR_P YES_P
--- 540,546 
  
  	WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE
  
! 	XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLPARSE
  	XMLPI XMLROOT XMLSERIALIZE
  
  	YEAR_P YES_P
***
*** 9806,9811  func_expr:	func_name '(' ')' over_clause
--- 9807,9828 
  {
  	$$ = makeXmlExpr(IS_XMLELEMENT, $4, $6, $8, @1);
  }
+ 			| XMLEXISTS '(' xmlexists_list ')'
+ {
+ 	/* xmlexists(A [PASSING BY REF B [BY REF]]) is converted to
+ 	 * xmlexists(A, B)*/
+ 	 
+ 	FuncCall *n = makeNode(FuncCall);
+ 	n-funcname = SystemFuncName(xmlexists);
+ 	n-args = $3;
+ 	n-agg_order = NIL;
+ 	n-agg_star = FALSE;
+ 	n-agg_distinct = FALSE;
+ 	n-func_variadic = FALSE;
+ 	n-over = NULL;
+ 	n-location = @1;
+ 	$$ = (Node *)n;
+ }
  			| XMLFOREST '(' xml_attribute_list ')'
  {
  	$$ = makeXmlExpr(IS_XMLFOREST, NULL, $3, NIL, @1);
***
*** 9896,9901  xml_whitespace_option: PRESERVE WHITESPACE_P		{ $$ = TRUE; }
--- 9913,9946 
  			| /*EMPTY*/{ $$ = FALSE; }
  		;
  
+ xmlexists_list:
+ 			AexprConst xmlexists_query_argument_list
+ {
+ 	$$ = list_make2(makeTypeCast($1,SystemTypeName(text), -1), $2);
+ }
+ 		;
+ 
+ xmlexists_query_argument_list

[PATCH] Re: [HACKERS] Adding xpath_exists function

2010-06-27 Thread Mike Fowler

Bruce Momjian wrote:

I have added this to the next commit-fest:

https://commitfest.postgresql.org/action/commitfest_view?id=6

  
Thanks Bruce. Attached is a revised patch which changes the code 
slightly such that it uses an older version of the libxml library. I've 
added comments to the code so that we remember why we didn't use the 
latest function.


Regards,

--
Mike Fowler
Registered Linux user: 379787



*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3495,3497  xpath(PG_FUNCTION_ARGS)
--- 3495,3681 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	ArrayBuildState *astate = NULL;
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	xmlXPathObjectPtr xpathobj = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar*string;
+ 	xmlChar*xpath_expr;
+ 	int			i;
+ 	int			res_nitems;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg(invalid array for XML namespace mapping),
+ 	 errdetail(The array must be two-dimensional with length of the second axis equal to 2.)));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  ns_names_uris, ns_names_uris_nulls,
+ 		  ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg(empty XPath expression)));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two parsings for the same value during one
+ 		 * command execution are possible)
+ 		 */
+ 		ctxt = xmlNewParserCtxt();
+ 		if (ctxt == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate parser context);
+ 		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+ 		if (doc == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ 		could not parse XML document);
+ 		xpathctx = xmlXPathNewContext(doc);
+ 		if (xpathctx == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate XPath context);
+ 		xpathctx-node = xmlDocGetRootElement(doc);
+ 		if (xpathctx-node == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+ 		could not find root XML element);
+ 
+ 		/* register namespaces, if any */
+ 		if (ns_count  0)
+ 		{
+ 			for (i = 0; i  ns_count; i++)
+ 			{
+ char	   *ns_name;
+ char	   *ns_uri;
+ 
+ if (ns_names_uris_nulls[i * 2] ||
+ 	ns_names_uris_nulls[i * 2 + 1])
+ 	ereport(ERROR,
+ 			(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+ 	  errmsg(neither namespace name nor URI may be null)));
+ ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+ ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
+ if (xmlXPathRegisterNs(xpathctx,
+ 	   (xmlChar *) ns_name,
+ 	   (xmlChar *) ns_uri) != 0)
+ 	ereport(ERROR,		/* is this an internal error??? */
+ 			(errmsg(could not register XML namespace with name \%s\ and URI \%s\,
+ 	ns_name, ns_uri)));
+ 			}
+ 		}
+ 
+ 		xpathcomp

Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-06-27 Thread Mike Fowler

Robert Haas wrote:

On Sun, Jun 27, 2010 at 12:04 PM, Mike Fowler m...@mlfowler.com wrote:
  

Thanks again for your help Robert, turns out the fault was in the pg_proc
entry (the 3 up there should've been a two!). Once I took the grammar out it
was quickly obvious where I'd gone wrong.



Glad it was a helpful suggestion.

  

Attached is a patch with the revised XMLEXISTS function, complete with
grammar support and regression tests. The implemented grammar is:

XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] )

Though the full grammar makes everything after the xpath_expression
optional, I've left it has mandatory simply to avoid lots of rework of the
function (would need new null checks, memory handling would need reworking).



So if you don't specify the xml_value, what does the xpath_expression
get applied to?
  
From what I can gather the xpath_expression would be evalutated against 
an empty document thereby returning false for every xpath_expression 
except for 'true()'. Apache Derby has made the xml_value mandatory as 
well (though I'll stress my conclusion wasn't based on this fact). If 
you think it would better to adhere more closely to the standard I can 
certainly look to do so. From a cursory glance at libxml's API I think 
it should be straight forward to query against an empty document such 
that I wouldn't need ot code for the exceptional case (or cases if I've 
missed others).


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-06-24 Thread Mike Fowler

Robert Haas wrote:


I usually troubleshoot things like this by setting a breakpoint in
elog_start or elog_finish.  Then you can see where it's blowing up.
Off the top of my head, I would guess you've added a node type whose
structure definition doesn't begin with NodeTag, or else you've got a
memory clobber.


Thanks Robert, I've managed to resolve this make making a type cast 
inside gram.y. However, it now seems that the function itself can not be 
found. I've made an entry in pg_proc.h, but when running psql I'm 
getting the following:


xmltest=# SELECT COUNT(id) FROM xmltest WHERE xmlexists('/menu/beers' 
PASSING BY REF data);

ERROR:  function pg_catalog.xml_exists(text, xml) does not exist
LINE 1: SELECT COUNT(id) FROM xmltest WHERE xmlexists('/menu/beers' ...
   ^
HINT:  No function matches the given name and argument types. You might 
need to add explicit type casts.


In gram.y I've got:

FuncCall *n = makeNode(FuncCall);
n-funcname = SystemFuncName(xml_exists);

(also tried SystemFuncName(xmlexists);)

In xml.h:

extern bool xml_exists(text *xpath_expr_text, xmltype *data);

I've also tried

bool xml_exists(PG_FUNCTION_ARGS) {

and finally in pg_proc.h I have:

DATA(insert OID = 3037 (  xmlexists PGNSP PGUID 12 1 0 0 f f f t f i 
3 0 16 25 142 _null_ _null_ _null_ _null_ xml_exists _null_ _null_ 
_null_ ));

DESCR(evaluate XPath expression in a boolean context);

(also tried ( xml_exists   PGNSP))

After each attempt, I've blown away the installation, made clean and 
installed, initialised a fresh database and restored my sample database. 
I've had a grep around using position and it's target function textpos 
as examples but I fail to see any other file that they live in other 
than their implementation. As far as I can tell, I'm not doing anything 
different from position. Any thoughts?


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-06-22 Thread Mike Fowler



Look at how the POSITION() pseudofunction is defined around gram.y
line 9651.  Essentially any special syntax of this type gets converted
to a regular function call internally.  So in your case I think there
will be some function that gets called something ike this:

xmlexists(xpath_expression, xml_expression)

...but the grammar can be modified to allow a different syntax for
that function call.
  
I've finally managed to get gram.y to parse the syntax correctly. After 
progressing from a segmentation fault that occured when the grammar was 
correct I'm now left with a cryptic error:


xmltest=# SELECT COUNT(id) FROM xmltest WHERE xmlexists('/menu/beers' 
PASSING BY REF data);

ERROR:  unrecognized node type: 1852140847

At a guess there is another step that I need to do after modifying 
gram.y. One mailing list posting I found mentioned copyfuncs.c but 
really I'm unsure as to what next. Anyone know what the missing step is?


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Adding XMLEXISTS to the grammar

2010-06-09 Thread Mike Fowler

Hi,

I've been working to improve the syntax of the XMLEXISTS function that I 
put a patch forward for and have been attempting to get my head around 
how you modify the grammar. I admit I'm not getting much anywhere 
probably as I don't know bison but I'm starting to wonder if it's worth 
the pain given recent comments on this list about not changing the 
grammar for JSON support. At this point I can see a way of implementing 
the following abridged syntax (abridged as I'm not doing full XQuery 
support at this stage) in a conventional plain function call by handling 
the PG_FUNCTION_ARGS approriately, but would this acceptable?


XMLEXISTS
(
xpath_expression
 [
  PASSING BY REF xml_expression [BY REF]
 ]
)

In case it isn't, and indeed to help me with the XML schema validation 
work I'm doing, I would still like some help on how the grammar works. 
From what I've greped and seen in the comments you need to modify the 
following files:


- src/backend/parser/gram.y
- src/backend/parser/parse_expr.c
- src/backend/utils/ruleutils.c
- src/include/parser/kwlist.h

From what I can tell, you add the keywords to the lists in gram.y and 
kwlist.h. At the appropriate place in gram.y you define the syntax and 
pull out what you need and stuff it into a node (in my case using the 
makeXmlExpr). You then modify parse_expr.c and ruleutils.c to handle the 
new values in the fields of the XmlExpr node. Assuming I'm right so far, 
the step I'm failing to figure out is where the actual c function that 
implements the function gets called/associated within the grammar. What 
am I missing?


Thanks in advance,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard

2010-05-27 Thread Mike Fowler

Peter Eisentraut wrote:

On ons, 2010-05-26 at 11:47 +0100, Mike Fowler wrote:
  

The XMLEXISTS function works with XQuery expressions and doesn't have
the call signature that your patch implements
  
Looking at the manuals of Oracle, Derby and DB2 I see how the call 
signature differs. I also note that Oracle's implementation is XPath 
only, Derby's is partial XQuery and DB2 appears to be full XQuery.


What do people prefer me to do? I see the options as:

1) Change the call signature to match the standard
2) Change the function name back to xpath_exists



It would be nice to make XMLEXISTS work as in the standard, seeing how
many others are providing the same interface.

  

Should option one be the more popular there's further choices:

1) Integrate XQuery support to completely match the standard, however 
this will require the addition of a new library libxquery

2) Leave the XPath as is, inline with Oracle's implementation
3) Hybrid approach. Since XML is a comple time option, add XQuery as 
another. Conditional completion gives the full XQuery support when 
available or just the XPath when not



I think providing XPath is enough, at least for now


Agreed. I'll get another patch together in the next day or two.

Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard

2010-05-26 Thread Mike Fowler

Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:
  

On Tue, May 25, 2010 at 1:09 PM, Mike Fowler m...@mlfowler.com wrote:


We're unlikely to accept this patch if it changes the minimum version
of libxml2 required to compile PostgreSQL


Why? 2.6.27 is almost 4 years old.

At a minimum, I think it's fair to say that the burden is on you to
justify what it's worth bumping the version number.



Yes.  Increasing the minimum required version of some library is a Big
Deal, we don't do it on a whim.  And we definitely don't do it just
because it's old.

regards, tom lane

  


OK, I consider myself suitably educated/chastised. I now understand why 
a version bump is such a big deal. Your objections are all reasonable, I 
suppose I'm just used to living on the bleeding edge of everything. 
Consequently I have changed the code to produce the same result in a 
different way without using the new function. I've down-graded my 
version to 2.6.26 and it all compiles cleanly. Please find attached my 
revised patch, and thanks all for your advise.


Regards,

--
Mike Fowler
Registered Linux user: 379787

Index: src/backend/utils/adt/xml.c
===
RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.97
diff -c -r1.97 xml.c
*** src/backend/utils/adt/xml.c	3 Mar 2010 17:29:45 -	1.97
--- src/backend/utils/adt/xml.c	26 May 2010 09:36:50 -
***
*** 3495,3497 
--- 3495,3678 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xmlexists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	xmlXPathObjectPtr xpathobj = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar*string;
+ 	xmlChar*xpath_expr;
+ 	int			i;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg(invalid array for XML namespace mapping),
+ 	 errdetail(The array must be two-dimensional with length of the second axis equal to 2.)));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  ns_names_uris, ns_names_uris_nulls,
+ 		  ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg(empty XPath expression)));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two parsings for the same value during one
+ 		 * command execution are possible)
+ 		 */
+ 		ctxt = xmlNewParserCtxt();
+ 		if (ctxt == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate parser context);
+ 		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+ 		if (doc == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ 		could not parse XML document);
+ 		xpathctx = xmlXPathNewContext(doc);
+ 		if (xpathctx == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate XPath context);
+ 		xpathctx-node = xmlDocGetRootElement(doc

Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard

2010-05-26 Thread Mike Fowler

Peter Eisentraut wrote:

On tis, 2010-05-25 at 15:31 +0100, Mike Fowler wrote:
  
I've been reading the SQL/XML standard and discovered that it defines a 
function named XMLEXISTS that does exactly what the todo item 
xpath_exists defines. My original patch named the function as per the 
todo but I think using the function name from the standard is a better 
idea. So this patch is the same as before, but the function is now named 
XMLEXISTS instead of xpath_exists.



The XMLEXISTS function works with XQuery expressions and doesn't have
the call signature that your patch implements


Looking at the manuals of Oracle, Derby and DB2 I see how the call 
signature differs. I also note that Oracle's implementation is XPath 
only, Derby's is partial XQuery and DB2 appears to be full XQuery.


What do people prefer me to do? I see the options as:

1) Change the call signature to match the standard
2) Change the function name back to xpath_exists

Should option one be the more popular there's further choices:

1) Integrate XQuery support to completely match the standard, however 
this will require the addition of a new library libxquery

2) Leave the XPath as is, inline with Oracle's implementation
3) Hybrid approach. Since XML is a comple time option, add XQuery as 
another. Conditional completion gives the full XQuery support when 
available or just the XPath when not


Thoughts?

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard (was: Add xpath_exists Function)

2010-05-25 Thread Mike Fowler
I've been reading the SQL/XML standard and discovered that it defines a 
function named XMLEXISTS that does exactly what the todo item 
xpath_exists defines. My original patch named the function as per the 
todo but I think using the function name from the standard is a better 
idea. So this patch is the same as before, but the function is now named 
XMLEXISTS instead of xpath_exists.


Regards,

--
Mike Fowler
Registered Linux user: 379787

Index: src/backend/utils/adt/xml.c
===
RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.97
diff -c -r1.97 xml.c
*** src/backend/utils/adt/xml.c	3 Mar 2010 17:29:45 -	1.97
--- src/backend/utils/adt/xml.c	25 May 2010 14:02:33 -
***
*** 3495,3497 
--- 3495,3668 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xmlexists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar*string;
+ 	xmlChar*xpath_expr;
+ 	int			i;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg(invalid array for XML namespace mapping),
+ 	 errdetail(The array must be two-dimensional with length of the second axis equal to 2.)));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  ns_names_uris, ns_names_uris_nulls,
+ 		  ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg(empty XPath expression)));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two parsings for the same value during one
+ 		 * command execution are possible)
+ 		 */
+ 		ctxt = xmlNewParserCtxt();
+ 		if (ctxt == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate parser context);
+ 		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+ 		if (doc == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ 		could not parse XML document);
+ 		xpathctx = xmlXPathNewContext(doc);
+ 		if (xpathctx == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate XPath context);
+ 		xpathctx-node = xmlDocGetRootElement(doc);
+ 		if (xpathctx-node == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+ 		could not find root XML element);
+ 
+ 		/* register namespaces, if any */
+ 		if (ns_count  0)
+ 		{
+ 			for (i = 0; i  ns_count; i++)
+ 			{
+ char	   *ns_name;
+ char	   *ns_uri;
+ 
+ if (ns_names_uris_nulls[i * 2] ||
+ 	ns_names_uris_nulls[i * 2 + 1])
+ 	ereport(ERROR,
+ 			(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+ 	  errmsg(neither namespace name nor URI may be null)));
+ ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+ ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
+ if (xmlXPathRegisterNs(xpathctx,
+ 	   (xmlChar *) ns_name,
+ 	   (xmlChar *) ns_uri) != 0)
+ 	ereport(ERROR

Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard

2010-05-25 Thread Mike Fowler

Erik Rijkers wrote:

libxml2.x86_64  2.6.26-2.1.2.8  installed
libxml2-devel.x86_642.6.26-2.1.2.8  installed
  


Thanks for testing my patch Erik. It turns out I've got libxml2 
installed at version 2.7.5. Searching the gnome mailing lists, it turns 
out xmlXPathCompiledEvalToBoolean was added (unbelievably) in the very 
next version from yours, 2.6.27 (see: 
http://mail.gnome.org/archives/xml/2006-October/msg00119.html).


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard

2010-05-25 Thread Mike Fowler

Robert Haas wrote:

On Tue, May 25, 2010 at 12:04 PM, Mike Fowler m...@mlfowler.com wrote:
  

Erik Rijkers wrote:


libxml2.x86_64  2.6.26-2.1.2.8  installed
libxml2-devel.x86_642.6.26-2.1.2.8  installed

  

Thanks for testing my patch Erik. It turns out I've got libxml2 installed at
version 2.7.5. Searching the gnome mailing lists, it turns out
xmlXPathCompiledEvalToBoolean was added (unbelievably) in the very next
version from yours, 2.6.27 (see:
http://mail.gnome.org/archives/xml/2006-October/msg00119.html).



We're unlikely to accept this patch if it changes the minimum version
of libxml2 required to compile PostgreSQL


Why? 2.6.27 is almost 4 years old.

I realise that my patch didn't update configure and configure.in, and 
indeed I didn't think of it when I responded to Erik (I'm too used to 
the Java world where people manage their own dependencies). I've now 
attached the updated patch which ups the check from version 2.6.23 to 
2.6.27.


Regards,

--
Mike Fowler
Registered Linux user: 379787

Index: configure
===
RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/configure,v
retrieving revision 1.679
diff -c -r1.679 configure
*** configure	13 May 2010 22:07:40 -	1.679
--- configure	25 May 2010 16:57:49 -
***
*** 9079,9087 
  
  if test $with_libxml = yes ; then
  
! { $as_echo $as_me:$LINENO: checking for xmlSaveToBuffer in -lxml2 5
! $as_echo_n checking for xmlSaveToBuffer in -lxml2...  6; }
! if test ${ac_cv_lib_xml2_xmlSaveToBuffer+set} = set; then
$as_echo_n (cached)  6
  else
ac_check_lib_save_LIBS=$LIBS
--- 9079,9087 
  
  if test $with_libxml = yes ; then
  
! { $as_echo $as_me:$LINENO: checking for xmlXPathCompiledEvalToBoolean in -lxml2 5
! $as_echo_n checking for xmlXPathCompiledEvalToBoolean in -lxml2...  6; }
! if test ${ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean+set} = set; then
$as_echo_n (cached)  6
  else
ac_check_lib_save_LIBS=$LIBS
***
*** 9099,9109 
  #ifdef __cplusplus
  extern C
  #endif
! char xmlSaveToBuffer ();
  int
  main ()
  {
! return xmlSaveToBuffer ();
;
return 0;
  }
--- 9099,9109 
  #ifdef __cplusplus
  extern C
  #endif
! char xmlXPathCompiledEvalToBoolean ();
  int
  main ()
  {
! return xmlXPathCompiledEvalToBoolean ();
;
return 0;
  }
***
*** 9129,9140 
  	 test $cross_compiling = yes ||
  	 $as_test_x conftest$ac_exeext
 }; then
!   ac_cv_lib_xml2_xmlSaveToBuffer=yes
  else
$as_echo $as_me: failed program was: 5
  sed 's/^/| /' conftest.$ac_ext 5
  
! 	ac_cv_lib_xml2_xmlSaveToBuffer=no
  fi
  
  rm -rf conftest.dSYM
--- 9129,9140 
  	 test $cross_compiling = yes ||
  	 $as_test_x conftest$ac_exeext
 }; then
!   ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean=yes
  else
$as_echo $as_me: failed program was: 5
  sed 's/^/| /' conftest.$ac_ext 5
  
! 	ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean=no
  fi
  
  rm -rf conftest.dSYM
***
*** 9142,9150 
conftest$ac_exeext conftest.$ac_ext
  LIBS=$ac_check_lib_save_LIBS
  fi
! { $as_echo $as_me:$LINENO: result: $ac_cv_lib_xml2_xmlSaveToBuffer 5
! $as_echo $ac_cv_lib_xml2_xmlSaveToBuffer 6; }
! if test x$ac_cv_lib_xml2_xmlSaveToBuffer = xyes; then
cat confdefs.h _ACEOF
  #define HAVE_LIBXML2 1
  _ACEOF
--- 9142,9150 
conftest$ac_exeext conftest.$ac_ext
  LIBS=$ac_check_lib_save_LIBS
  fi
! { $as_echo $as_me:$LINENO: result: $ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean 5
! $as_echo $ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean 6; }
! if test x$ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean = xyes; then
cat confdefs.h _ACEOF
  #define HAVE_LIBXML2 1
  _ACEOF
***
*** 9152,9159 
LIBS=-lxml2 $LIBS
  
  else
!   { { $as_echo $as_me:$LINENO: error: library 'xml2' (version = 2.6.23) is required for XML support 5
! $as_echo $as_me: error: library 'xml2' (version = 2.6.23) is required for XML support 2;}
 { (exit 1); exit 1; }; }
  fi
  
--- 9152,9159 
LIBS=-lxml2 $LIBS
  
  else
!   { { $as_echo $as_me:$LINENO: error: library 'xml2' (version = 2.6.27) is required for XML support 5
! $as_echo $as_me: error: library 'xml2' (version = 2.6.27) is required for XML support 2;}
 { (exit 1); exit 1; }; }
  fi
  
Index: configure.in
===
RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/configure.in,v
retrieving revision 1.627
diff -c -r1.627 configure.in
*** configure.in	13 May 2010 22:07:42 -	1.627
--- configure.in	25 May 2010 16:22:32 -
***
*** 940,946 
  fi
  
  if test $with_libxml = yes ; then
!   AC_CHECK_LIB(xml2, xmlSaveToBuffer, [], [AC_MSG_ERROR([library 'xml2' (version = 2.6.23) is required for XML support])])
  fi
  
  if test $with_libxslt = yes ; then
--- 940,946 
  fi
  
  if test $with_libxml = yes ; then
!   AC_CHECK_LIB(xml2

Re: [HACKERS] ecmascript 5 DATESTYLE

2010-05-19 Thread Mike Fowler

Pavel Stehule wrote:

2010/5/19 Peter Eisentraut pete...@gmx.net:
  

On tis, 2010-05-18 at 18:26 -0400, Ben Hockey wrote:


ecmascript 5 is the most recent specification for JavaScript and i
would think that having a DATESTYLE format to simplify
interoperability with JavaScript applications would be highly
desirable.
  

Note that we haven't got any other datestyles that are intended to
support interoperability with some language.  It is usually the job of
the client driver to convert PostgreSQL data (plural of datum) to the
appropriate type and format for the client environment or language.  Is
there any reason why JavaScript would be different?



I wouldn't be keen to see dedicated language specific handling of 
date/datetime formats. It would lead to an explosion of functions with 
new languages needing adding as and when their users jumped up and down 
on us. However a generic format could be very useful and would give the 
opportunity for people who need a language specific short cut the 
opportunity to do a CREATE FUNCTION wrapping the generic one with a hard 
coded format specifier.


Other platforms have generic support for this kind of task, for example 
SQLServer: http://msdn.microsoft.com/en-us/library/ms187928.aspx. I 
wouldn't recommend the SQLServer way, I think numeric format specifiers 
are clumsy. Perhaps a mechanism like Java which is nicely summarized 
here: 
http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html


Pavel: Why do you believe a generic format function would lead to SQL 
injections attacks?



JavaScript isn't special language, but JSON is wide used format for
interoperability. And same is true for XML datestyle format.

Regards
Pavel
  


I think that the postgres handling of those data types should handle the 
date encoding themselves. For example, a XMLELEMENT call that was passed 
a date would format the date string to the xs:date format (e.g. 
2010-05-19) and when passed a timestamp format to xs:datetime (e.g. 
2010-05-19T09:29:52+01:00). I would see the JSON handling as being no 
different.


Thanks,

--
Mike Fowler
Registered Linux user: 379787

I could be a genius if I just put my mind to it, and I,
I could do anything, if only I could get 'round to it
-PULP 'Glory Days'


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ecmascript 5 DATESTYLE

2010-05-19 Thread Mike Fowler

Pavel Stehule wrote:

see google: lateral sql injection oracle NLS_DATE_FORMAT

I would to like this functionality too - and technically I don't see a
problem - It's less than 100 lines, but I don't need a new security
problem. So my proposal is change nothing on this integrated
functionality and add new custom date type - like cdate that can be
customized via GUC.

Regards
Pavel


OK I found www.databasesecurity.com/dbsec/lateral-sql-injection.pdf. 
From the way I read this, the exploit relies on adjusting the 
NLS_DATE_FORMAT to an arbitrary string which is then used for the 
attack, To me this is easy to code against, simply lock the date format 
right down and ensure that it is always controlled. IMHO I don't see an 
Oracle specific attack as a reason why we can't have a generic format. 
Surely we can learn from this known vulnerability and get another one up 
on Oracle?


Thanks,

--
Mike Fowler
Registered Linux user: 379787

I could be a genius if I just put my mind to it, and I,
I could do anything, if only I could get 'round to it
-PULP 'Glory Days'


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ecmascript 5 DATESTYLE

2010-05-19 Thread Mike Fowler

Pavel Stehule wrote:

2010/5/19 Mike Fowler m...@mlfowler.com:
  

Pavel Stehule wrote:


see google: lateral sql injection oracle NLS_DATE_FORMAT

I would to like this functionality too - and technically I don't see a
problem - It's less than 100 lines, but I don't need a new security
problem. So my proposal is change nothing on this integrated
functionality and add new custom date type - like cdate that can be
customized via GUC.

Regards
Pavel
  

OK I found www.databasesecurity.com/dbsec/lateral-sql-injection.pdf. From
the way I read this, the exploit relies on adjusting the NLS_DATE_FORMAT to
an arbitrary string which is then used for the attack, To me this is easy to
code against, simply lock the date format right down and ensure that it is
always controlled. IMHO I don't see an Oracle specific attack as a reason
why we can't have a generic format. Surely we can learn from this known
vulnerability and get another one up on Oracle?



I am not a security expert - you can simply don't allow apostrophe,
double quotes - but I am not sure, if this can be safe - simply - I am
abe to write this patch, but I am not able to ensure security.

Regards
Pavel
  


Well you've rightly identified a potential security hole, so my 
recommendation would be to put the patch together bearing in mind the 
Oracle vulnerability. Once you've submitted the patch it can be reviewed 
and we can ensure that you've managed to steer clear of introducing the 
same/similar vulnerability into postgres.


Am I right in thinking that you're now proposing to do the generic patch 
that Robert Haas and I prefer?


Thanks,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Adding XML Schema validation (XMLVALIDATE)

2010-05-19 Thread Mike Fowler

Hi,

I'm going to start work on another XML todo item:

Add XML Schema validation and xmlvalidate function (SQL:2008)

The standard identifies XMLVALIDATE as:

XML validate ::=
   XMLVALIDATE left paren
  document or content or sequence
  XML value expression
  [ XML valid according to clause]
  right paren

so I've got something quite clear to work too. libxml has the required 
support for schema validation so I'll just be wrapping it's 
functionality much like I did for xpath_exists().


Anyone got any thoughts before I get busy?

Thanks,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding xpath_exists function

2010-05-11 Thread Mike Fowler

Robert Haas wrote:

Please email your patch to the list (replying to this email is fine)
and add it here:
https://commitfest.postgresql.org/action/commitfest_view/open
  
Here's my patch, developed against HEAD, that adds the function 
'xpath_exists'. The function is a lot simpler than originally thought, 
so none of the string manipulation previously discussed was required. 
I've also included some regression tests that test the function with and 
without xml namespaces. I should  note that before I added my tests all 
existing tests passed.


One observation that can be made is that I've largely copied the 
existing xpath function and altered it to use a different method from 
the libxml API. I've done it to save me redoing all the namespace 
handling, however it's apparent to me that if we wanted to expose more 
of the libxml api we will quickly start having a lot of duplicate code. 
I notice that refactoring existing code whilst adding new code is 
generally frowned upon, so once this patch is accepted I will look to 
refactor the xpath and xpath_exists function. I could even add an 
xpath_count method at the same time ;) .


Thanks in advance for any and all feedback,

--
Mike Fowler
Registered Linux user: 379787

I could be a genius if I just put my mind to it, and I,
I could do anything, if only I could get 'round to it
-PULP 'Glory Days'

Index: src/backend/utils/adt/xml.c
===
RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.97
diff -c -r1.97 xml.c
*** src/backend/utils/adt/xml.c	3 Mar 2010 17:29:45 -	1.97
--- src/backend/utils/adt/xml.c	11 May 2010 07:54:53 -
***
*** 3495,3497 
--- 3495,3670 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	ArrayBuildState *astate = NULL;
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar*string;
+ 	xmlChar*xpath_expr;
+ 	int			i;
+ 	int			res_nitems;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg(invalid array for XML namespace mapping),
+ 	 errdetail(The array must be two-dimensional with length of the second axis equal to 2.)));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  ns_names_uris, ns_names_uris_nulls,
+ 		  ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg(empty XPath expression)));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two parsings for the same value during one
+ 		 * command execution are possible)
+ 		 */
+ 		ctxt = xmlNewParserCtxt();
+ 		if (ctxt == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate parser context);
+ 		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+ 		if (doc == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ 		could

Re: [HACKERS] Adding xpath_exists function

2010-05-07 Thread Mike Fowler

Robert Haas wrote:

Oh, I see.  Well, that might be reasonable syntactic sugar, although I
think you should make it wrap the path in exists() unconditionally,
rather than testing for an existing wrap.
  


I'll leave it out for now, it saves me some effort after all.


Please email your patch to the list (replying to this email is fine)
and add it here:
https://commitfest.postgresql.org/action/commitfest_view/open
  


Will do once I've finished. Thanks for your feedback.

Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Adding xpath_exists function

2010-05-06 Thread Mike Fowler

Hi hackers,

Although this is a very small change I figured I'd practice the policy 
of outlining your change before you write the code and attempt a patch 
submission. Essentially I see the function as a convenience function 
that exposes the results of the xpath built in exists() function as a 
boolean for easier SQL. The syntax will match the xpath function already 
present:


xpath_exists(xpath, xml[, nsarray])

The implementation will check that the xpath value contains 
'exists(.)' and add it if not present for added usability. I can't 
blindly wrap the value with exists(...) as exists(exists(.)) will 
always return true in xpath. I've not allocated the oid yet, but will 
try to earn my bonus points for getting it close to xpath()'s oid. :) 
Appropriate regression tests will be added after I've ensured that 'make 
check' is happy I've not regressed anything.


Can I confirm that contrib/xml2 is deprecated and I should be carrying 
out my work in backend/utils/adt/xml.c?


I shall be coding it up over the next day or two, work  family permitting!

Thanks,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] XML Todo List

2010-05-02 Thread Mike Fowler

Hackers,

I'm interested in tackling some of the todo items in XML category. Being 
new to postgres hacking I'm hoping I chose an item that isn't more than 
I can chew in the first sitting. One item that has caught my eye that I 
(naively) hope isn't a huge todo is:


xpath_exists() is needed. It checks, whether or not the path specified 
exists in the XML value. (W/o this function we need to use weird 
array_dims(xpath(...)) IS NOT NULL syntax.)


Is any one else working on the XML todos who might have some friendly 
pointers to help me on my way or am I just better off getting some code 
together for general review?


Thanks,

--
Mike Fowler
Registered Linux user: 379787

I could be a genius if I just put my mind to it, and I,
I could do anything, if only I could get 'round to it
-PULP 'Glory Days'


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XML Todo List

2010-04-29 Thread Mike Fowler

Peter Eisentraut wrote:

On ons, 2010-04-28 at 15:21 +0100, Mike Fowler wrote:
  
xpath_exists() is needed. It checks, whether or not the path specified 
exists in the XML value. (W/o this function we need to use weird 
array_dims(xpath(...)) IS NOT NULL syntax.)



That sounds like a reasonable project.

  
Is any one else working on the XML todos who might have some friendly 
pointers to help me on my way or am I just better off getting some code 
together for general review?



I think you're it. :)
  

Great, I won't be stepping on any toes!

--
Mike Fowler
Registered Linux user: 379787

I could be a genius if I just put my mind to it, and I,
I could do anything, if only I could get 'round to it
-PULP 'Glory Days'


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] XML Todo List

2010-04-28 Thread Mike Fowler

Hackers,

I'm interested in tackling some of the todo items in XML category. Being 
new to postgres hacking I'm hoping I chose an item that isn't more than 
I can chew in the first sitting. One item that has caught my eye that I 
(naively) hope isn't a huge todo is:


xpath_exists() is needed. It checks, whether or not the path specified 
exists in the XML value. (W/o this function we need to use weird 
array_dims(xpath(...)) IS NOT NULL syntax.)


Is any one else working on the XML todos who might have some friendly 
pointers to help me on my way or am I just better off getting some code 
together for general review?


Thanks,

--
Mike Fowler
Registered Linux user: 379787

I could be a genius if I just put my mind to it, and I,
I could do anything, if only I could get 'round to it
-PULP 'Glory Days'


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers