Re: [HACKERS] remove contrib/xml2
Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Tom Lane wrote: ... The reason for that behavior is that xpath_table runs through the XPATH_NODESET results generated by the various XPaths and dumps the k'th one of each into the k'th output row generated for the current input row. ISTM the missing piece is really in our API. We need to be able to specify a nodeset to iterate over, and then for each node take the first value produced by each xpath expression. So the example above would look something like: SELECT * FROM xpath_table('id', 't', 'xpath_test', '/rowlist/row', '@a|@b', 'true') as t(id int4, a text, b text); Hm. It seems like that still leaves you open to the possibility of out-of-sync results. If you consider the current behavior as what you'd get with an empty root nodeset spec, then restricting it to produce only the first output row doesn't help at all -- it would still associate 1 with oops. In general if the nodeset spec doesn't select a unique subnode then you're at risk of bogus answers. Maybe that could be defined as user error but it sure seems like it would be error-prone to use. Well, I think that's going to be hard or impossible to avoid in the general case. My suggestion was intended to give the user a much better chance of avoiding it, however. Arbitrary XML (or JSON or YAML or any artbitrarilly tree structured data markup) doesn't map well to a rectangular structure, and this is always likely to cause problems like this IMO. I guess in the end the user could preprocess the XML with XSLT to remove the irregularities before passing to to xpath_table. We certainly need to put some warnings in the docs about it. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
I believe I have fixed all the reported crashes in contrib/xml2. However there is still this issue pointed out by Robert: CREATE TABLE xpath_test (id integer NOT NULL, t xml); INSERT INTO xpath_test VALUES (1, 'rowlistrow a=1/row a=2 b=oops//rowlist'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/rowlist/row/@a|/rowlist/row/@b', 'true') as t(id int4, a text, b text); which yields an answer that is, at least, extremely surprising, if not flat-out wrong: id | a | b +---+-- 1 | 1 | oops 1 | 2 | (2 rows) the point being that it seems like oops should be associated with 2 not 1. The reason for that behavior is that xpath_table runs through the XPATH_NODESET results generated by the various XPaths and dumps the k'th one of each into the k'th output row generated for the current input row. If there is any way to synchronize which node in each array goes with each node in each other array, it's not apparent to me, but I don't know libxml's API at all. Perhaps there is some other call we should be using to evaluate all the XPaths in parallel? (The code is also unbelievably inefficient, recompiling each XPath expression once per output row (!); but it doesn't seem worth fixing that right away given that we might have to throw away the logic entirely in order to fix this bug.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Tom Lane wrote: I believe I have fixed all the reported crashes in contrib/xml2. Yay! Well done! That at least removes any possibly urgency about removing the module. However there is still this issue pointed out by Robert: CREATE TABLE xpath_test (id integer NOT NULL, t xml); INSERT INTO xpath_test VALUES (1, 'rowlistrow a=1/row a=2 b=oops//rowlist'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/rowlist/row/@a|/rowlist/row/@b', 'true') as t(id int4, a text, b text); which yields an answer that is, at least, extremely surprising, if not flat-out wrong: id | a | b +---+-- 1 | 1 | oops 1 | 2 | (2 rows) the point being that it seems like oops should be associated with 2 not 1. The reason for that behavior is that xpath_table runs through the XPATH_NODESET results generated by the various XPaths and dumps the k'th one of each into the k'th output row generated for the current input row. If there is any way to synchronize which node in each array goes with each node in each other array, it's not apparent to me, but I don't know libxml's API at all. Perhaps there is some other call we should be using to evaluate all the XPaths in parallel? (The code is also unbelievably inefficient, recompiling each XPath expression once per output row (!); but it doesn't seem worth fixing that right away given that we might have to throw away the logic entirely in order to fix this bug.) Damn that's ugly. ISTM the missing piece is really in our API. We need to be able to specify a nodeset to iterate over, and then for each node take the first value produced by each xpath expression. So the example above would look something like: SELECT * FROM xpath_table('id', 't', 'xpath_test', '/rowlist/row', '@a|@b', 'true') as t(id int4, a text, b text); Maybe we could approximate that with the current API by factoring out the common root of the xpath expressions, but that's likely to be extremely fragile and error prone, and we've already got bad experience of trying to be too cute with xpath expressions. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Andrew Dunstan and...@dunslane.net writes: Tom Lane wrote: ... The reason for that behavior is that xpath_table runs through the XPATH_NODESET results generated by the various XPaths and dumps the k'th one of each into the k'th output row generated for the current input row. Damn that's ugly. Yup :-( ISTM the missing piece is really in our API. We need to be able to specify a nodeset to iterate over, and then for each node take the first value produced by each xpath expression. So the example above would look something like: SELECT * FROM xpath_table('id', 't', 'xpath_test', '/rowlist/row', '@a|@b', 'true') as t(id int4, a text, b text); Hm. It seems like that still leaves you open to the possibility of out-of-sync results. If you consider the current behavior as what you'd get with an empty root nodeset spec, then restricting it to produce only the first output row doesn't help at all -- it would still associate 1 with oops. In general if the nodeset spec doesn't select a unique subnode then you're at risk of bogus answers. Maybe that could be defined as user error but it sure seems like it would be error-prone to use. Maybe we could approximate that with the current API by factoring out the common root of the xpath expressions, but that's likely to be extremely fragile and error prone, and we've already got bad experience of trying to be too cute with xpath expressions. Agreed, we do not want to be doing textual manipulations of XPaths, which is what burnt us before. But does libxml2 offer any more abstract path representation we could work on? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
M Z escribió: Hi Alvaro, I followed your instruction but put the patch on 8.4.2 as I found it crashes. It looks like the server still crash in the same way. Can you and anyone give me some ideas how to fix this bug? See src/backend/utils/adt/xml.c. Note the comment at the top: /* * Notes on memory management: * * Sometimes libxml allocates global structures in the hope that it can reuse * them later on. This makes it impractical to change the xmlMemSetup * functions on-the-fly; that is likely to lead to trying to pfree() chunks * allocated with malloc() or vice versa. Since libxml might be used by * loadable modules, eg libperl, our only safe choices are to change the * functions at postmaster/backend launch or not at all. Since we'd rather * not activate libxml in sessions that might never use it, the latter choice * is the preferred one. However, for debugging purposes it can be awfully * handy to constrain libxml's allocations to be done in a specific palloc * context, where they're easy to track. Therefore there is code here that * can be enabled in debug builds to redirect libxml's allocations into a * special context LibxmlContext. It's not recommended to turn this on in * a production build because of the possibility of bad interactions with * external modules. */ /* #define USE_LIBXMLCONTEXT */ Then if you look at xpath.c in contrib/xml2 you notice that it's doing exactly the thing that the core module says it's unreliable: using palloc and friends in xmlMemSetup. So to fix the bug what's needed is that the xmlMemSetup call in contrib is removed altogether, and all memory is tracked and released by hand. It's rather tedious, and it's also difficult to plug all resulting memory leaks. But AFAIUI doing that would fix (some of?) the crashes. Not sure if your crash is in this category. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Alvaro Herrera alvhe...@commandprompt.com writes: Then if you look at xpath.c in contrib/xml2 you notice that it's doing exactly the thing that the core module says it's unreliable: using palloc and friends in xmlMemSetup. So to fix the bug what's needed is that the xmlMemSetup call in contrib is removed altogether, and all memory is tracked and released by hand. It's rather tedious, and it's also difficult to plug all resulting memory leaks. But AFAIUI doing that would fix (some of?) the crashes. Not sure if your crash is in this category. FWIW, the core xml code seems to have been pretty stable since we gave up on trying to redirect libxml's memory allocations to palloc. So what you basically need to do to xpath.c is something like this: http://archives.postgresql.org/pgsql-committers/2009-05/msg00229.php regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Tom Lane t...@sss.pgh.pa.us writes: FWIW, the core xml code seems to have been pretty stable since we gave up on trying to redirect libxml's memory allocations to palloc. So what you basically need to do to xpath.c is something like this: http://archives.postgresql.org/pgsql-committers/2009-05/msg00229.php Which translates to this git URL, for lazy browsers such as me: http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=f8059d7f8ae8bfac840fbda3c8efbc0f7c09b123 -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Hi Alvaro, I followed your instruction but put the patch on 8.4.2 as I found it crashes. It looks like the server still crash in the same way. Can you and anyone give me some ideas how to fix this bug? == conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t xml); CREATE TABLE conifer=# INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); INSERT 0 1 conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. ! == Best, M Z CREATE TABLE xpath_test (id integer NOT NULL, t xml); INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); Hmm. Well, all I know is that the first thing I tried crashed the server. This trivial patch lingering on my system fixes this crasher (this is for the 8.3 branch). It makes the problem in alloc set ExprContext warning show up instead. There are still lotsa other holes, but hey, this is a start ... Index: contrib/xml2/xpath.c === RCS file: /home/alvherre/Code/cvs/pgsql/contrib/xml2/xpath.c,v retrieving revision 1.16.2.1 diff -c -p -r1.16.2.1 xpath.c *** contrib/xml2/xpath.c26 Mar 2008 01:19:11 - 1.16.2.1 --- contrib/xml2/xpath.c27 Jan 2010 15:30:56 - *** xpath_table(PG_FUNCTION_ARGS) *** 793,798 --- 793,801 */ pgxml_parser_init(); + PG_TRY(); + { + /* For each row i.e. document returned from SPI */ for (i = 0; i proc; i++) { *** xpath_table(PG_FUNCTION_ARGS) *** 929,934 --- 932,944 if (xmldoc) pfree(xmldoc); } + } + PG_CATCH(); + { + xmlCleanupParser(); + PG_RE_THROW(); + } + PG_END_TRY(); xmlCleanupParser(); /* Needed to flag completeness in 7.3.1. 7.4 defines it as a no-op. */ -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
The thing is, why it doesn't crash on 8.3.8 but crash on 8.4.2? Any idea? A patch was applied to 8.3 but not to 8.4.2? Thanks, M Z On Fri, Feb 5, 2010 at 1:45 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Feb 3, 2010 at 8:49 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Robert Haas escribió: On Mon, Feb 1, 2010 at 5:23 PM, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: (2) add a very, very large warning that this will crash if you do almost anything with it. I think that's an exaggeration. Certain people are known to be using it quite successfully. Hmm. Well, all I know is that the first thing I tried crashed the server. CREATE TABLE xpath_test (id integer NOT NULL, t xml); INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); This trivial patch lingering on my system fixes this crasher (this is for the 8.3 branch). It makes the problem in alloc set ExprContext warning show up instead. There are still lotsa other holes, but hey, this is a start ... Interestingly M Z found he couldn't reproduce this crash on 8.3. Can you? If so, +1 for applying this and backpatching it as far as make sense. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
M Z jm80...@gmail.com writes: The thing is, why it doesn't crash on 8.3.8 but crash on 8.4.2? Any idea? Pure luck. Memory-clobber bugs like these are notoriously nondeterministic. Any minor, logically unrelated change could make them visible or not visible, because the clobber happens to clobber data that is or is not in active use at the moment. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
On Thu, Feb 4, 2010 at 10:51 PM, M Z jm80...@gmail.com wrote: I did some tests followed Robert's test cases on both postgresql 8.4.2-0ubu and 8.3.8-1, OS: Ubuntu Karmic. 1) 1st test case, it doesn't crash on 8.3.8 but crash on 8.4.2; Interesting. So, that's a regression of some kind. 2) 2nd test case, both 8.3.8 and 8.4.2 are fine, and no warning (different from Robert's test?); I built with --enable-debug and --enable-cassert, which might be relevant. 3) 3rd test case (and modified test case for 8.3.8), both 8.3.8 and 8.4.2 are not correct, same with Robert's test (8.5 beta?); As I think about that further, it might not be a bug - how is the processor supposed to know what we expect to happen? But then, I don't really know how this is supposed to work. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
On Wed, Feb 3, 2010 at 8:49 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Robert Haas escribió: On Mon, Feb 1, 2010 at 5:23 PM, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: (2) add a very, very large warning that this will crash if you do almost anything with it. I think that's an exaggeration. Certain people are known to be using it quite successfully. Hmm. Well, all I know is that the first thing I tried crashed the server. CREATE TABLE xpath_test (id integer NOT NULL, t xml); INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); This trivial patch lingering on my system fixes this crasher (this is for the 8.3 branch). It makes the problem in alloc set ExprContext warning show up instead. There are still lotsa other holes, but hey, this is a start ... Interestingly M Z found he couldn't reproduce this crash on 8.3. Can you? If so, +1 for applying this and backpatching it as far as make sense. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
I did some tests followed Robert's test cases on both postgresql 8.4.2-0ubu and 8.3.8-1, OS: Ubuntu Karmic. 1) 1st test case, it doesn't crash on 8.3.8 but crash on 8.4.2; 2) 2nd test case, both 8.3.8 and 8.4.2 are fine, and no warning (different from Robert's test?); 3) 3rd test case (and modified test case for 8.3.8), both 8.3.8 and 8.4.2 are not correct, same with Robert's test (8.5 beta?); * 1st test case: == 8.3.8 == conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t xml); CREATE TABLE conifer=# INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); INSERT 0 1 conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); id 1 (1 row) == 8.4.2 == conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t xml); CREATE TABLE conifer=# INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); INSERT 0 1 conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. ! * * 2nd test case == 8.3.8 and 8.4.2 == conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t text); CREATE TABLE conifer=# INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); INSERT 0 1 conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); id 1 (1 row) * * 3rd test case == 8.3.8 and 8.4.2 == conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t text); CREATE TABLE conifer=# INSERT INTO xpath_test VALUES (1, 'rowlistrow a=1/row a=2 b=oops//rowlist'); INSERT 0 1 conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test', '/rowlist/row/@a|/rowlist/row/@b', 'true') as t(id int4, a text, b text); id | a | b +---+-- 1 | 1 | oops 1 | 2 | (2 rows) == 8.3.8 (modified 3rd test case, because 8.3.8 won't crash using xml) == conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t xml); CREATE TABLE conifer=# INSERT INTO xpath_test VALUES (1, 'rowlistrow a=1/row a=2 b=oops//rowlist'); INSERT 0 1 conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test', '/rowlist/row/@a|/rowlist/row/@b', 'true') as t(id int4, a text, b text); id | a | b +---+-- 1 | 1 | oops 1 | 2 | (2 rows) * For 1st test case, not sure if some paths applied to 8.3 haven't been applied to 8.4, or other reasons cause the difference between 8.3.8 and 8.4.2. Any ideas or comments? Thank you, M Z On Mon, Feb 1, 2010 at 8:44 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Feb 1, 2010 at 5:23 PM, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: (2) add a very, very large warning that this will crash if you do almost anything with it. I think that's an exaggeration. Certain people are known to be using it quite successfully. Hmm. Well, all I know is that the first thing I tried crashed the server. CREATE TABLE xpath_test (id integer NOT NULL, t xml); INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); It doesn't crash if you change the type of t from xml to text; instead you get a warning about some sort of memory allocation problem. DROP TABLE xpath_test; CREATE TABLE xpath_test (id integer NOT NULL, t text); INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); yields: WARNING: problem in alloc set ExprContext: bogus aset link in block 0x14645e0, chunk 0x14648b8 And then there's this (see also bug #5285): DELETE FROM xpath_test; INSERT INTO xpath_test VALUES (1, 'rowlistrow a=1/row a=2 b=oops//rowlist'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/rowlist/row/@a|/rowlist/row/@b', 'true') as t(id int4, a text, b text); which yields an answer that is, at least, extremely surprising, if not flat-out wrong: id | a | b +---+-- 1 | 1 | oops 1 | 2 | (2 rows) Bugs #4953 and #5079 can also be reproduced in CVS HEAD. Both crash the server. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Robert Haas escribió: On Mon, Feb 1, 2010 at 5:23 PM, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: (2) add a very, very large warning that this will crash if you do almost anything with it. I think that's an exaggeration. Certain people are known to be using it quite successfully. Hmm. Well, all I know is that the first thing I tried crashed the server. CREATE TABLE xpath_test (id integer NOT NULL, t xml); INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); This trivial patch lingering on my system fixes this crasher (this is for the 8.3 branch). It makes the problem in alloc set ExprContext warning show up instead. There are still lotsa other holes, but hey, this is a start ... Index: contrib/xml2/xpath.c === RCS file: /home/alvherre/Code/cvs/pgsql/contrib/xml2/xpath.c,v retrieving revision 1.16.2.1 diff -c -p -r1.16.2.1 xpath.c *** contrib/xml2/xpath.c26 Mar 2008 01:19:11 - 1.16.2.1 --- contrib/xml2/xpath.c27 Jan 2010 15:30:56 - *** xpath_table(PG_FUNCTION_ARGS) *** 793,798 --- 793,801 */ pgxml_parser_init(); + PG_TRY(); + { + /* For each row i.e. document returned from SPI */ for (i = 0; i proc; i++) { *** xpath_table(PG_FUNCTION_ARGS) *** 929,934 --- 932,944 if (xmldoc) pfree(xmldoc); } + } + PG_CATCH(); + { + xmlCleanupParser(); + PG_RE_THROW(); + } + PG_END_TRY(); xmlCleanupParser(); /* Needed to flag completeness in 7.3.1. 7.4 defines it as a no-op. */ -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
On Thu, Jan 28, 2010 at 5:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: Robert Haas wrote: I think we need to either (1) fix the bugs and update the documentation to remove the statement that this will be removed or (2) actually remove it. I agree it's a mess but I don't think just abandoning the functionality is a good idea. Yeah, we can't really remove it until we have a plausible substitute for the xpath_table functionality. This is in the TODO list ... My feeling is that if it's as flakey and unreliable as it currently is, we shouldn't ship it. Removing it from CVS doesn't mean you can't use this any more; this is open source. It just means people will have to go and get an old copy out of CVS and presumably in so doing they will be aware that we've removed it for a reason. We have a well-deserved reputation for quality and I would like to see us preserve that. Is anyone going to try to fix this for 9.0? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Robert Haas robertmh...@gmail.com writes: My feeling is that if it's as flakey and unreliable as it currently is, we shouldn't ship it. Removing it from CVS doesn't mean you can't use this any more; this is open source. It just means people will have to go and get an old copy out of CVS and presumably in so doing they will be aware that we've removed it for a reason. We have a well-deserved reputation for quality and I would like to see us preserve that. [ shrug... ] It is not any more flaky than it's been since it was put in. The people who have been depending on it presumably have use-patterns for which it doesn't fail, and we're not going to be doing them a service by ripping out functionality for which we can't offer a replacement. As for the quality argument, contrib modules are not guaranteed to be of the same standard as the core code; anyone who thinks they are should disabuse themselves of the notion by reading some of that code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
On Mon, Feb 1, 2010 at 1:38 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: My feeling is that if it's as flakey and unreliable as it currently is, we shouldn't ship it. Removing it from CVS doesn't mean you can't use this any more; this is open source. It just means people will have to go and get an old copy out of CVS and presumably in so doing they will be aware that we've removed it for a reason. We have a well-deserved reputation for quality and I would like to see us preserve that. [ shrug... ] It is not any more flaky than it's been since it was put in. The people who have been depending on it presumably have use-patterns for which it doesn't fail, and we're not going to be doing them a service by ripping out functionality for which we can't offer a replacement. Well, then we'd at least better update the documentation to (1) remove the statement that this will be removed in 8.4 (since we didn't), and (2) add a very, very large warning that this will crash if you do almost anything with it. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Robert Haas wrote: (2) add a very, very large warning that this will crash if you do almost anything with it. I think that's an exaggeration. Certain people are known to be using it quite successfully. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
On Mon, Feb 1, 2010 at 5:23 PM, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: (2) add a very, very large warning that this will crash if you do almost anything with it. I think that's an exaggeration. Certain people are known to be using it quite successfully. Hmm. Well, all I know is that the first thing I tried crashed the server. CREATE TABLE xpath_test (id integer NOT NULL, t xml); INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); It doesn't crash if you change the type of t from xml to text; instead you get a warning about some sort of memory allocation problem. DROP TABLE xpath_test; CREATE TABLE xpath_test (id integer NOT NULL, t text); INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); yields: WARNING: problem in alloc set ExprContext: bogus aset link in block 0x14645e0, chunk 0x14648b8 And then there's this (see also bug #5285): DELETE FROM xpath_test; INSERT INTO xpath_test VALUES (1, 'rowlistrow a=1/row a=2 b=oops//rowlist'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/rowlist/row/@a|/rowlist/row/@b', 'true') as t(id int4, a text, b text); which yields an answer that is, at least, extremely surprising, if not flat-out wrong: id | a | b +---+-- 1 | 1 | oops 1 | 2 | (2 rows) Bugs #4953 and #5079 can also be reproduced in CVS HEAD. Both crash the server. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Robert Haas wrote: There has been some more discussion lately of problems caused by contrib/xml2. http://archives.postgresql.org/pgsql-bugs/2010-01/msg00251.php http://archives.postgresql.org/pgsql-bugs/2010-01/msg00198.php I think we need to either (1) fix the bugs and update the documentation to remove the statement that this will be removed or (2) actually remove it. Nobody seems interested in #1, so PFA a patch to do #2. It also rips out all the libxslt stuff, which seems to exist only for the purpose of supporting contrib/xml2. The problem is that there are people who use the XSLT and xpath_table stuff on text data and so don't run into these bugs. I agree it's a mess but I don't think just abandoning the functionality is a good idea. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
On Thu, Jan 28, 2010 at 4:18 PM, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: There has been some more discussion lately of problems caused by contrib/xml2. http://archives.postgresql.org/pgsql-bugs/2010-01/msg00251.php http://archives.postgresql.org/pgsql-bugs/2010-01/msg00198.php I think we need to either (1) fix the bugs and update the documentation to remove the statement that this will be removed or (2) actually remove it. Nobody seems interested in #1, so PFA a patch to do #2. It also rips out all the libxslt stuff, which seems to exist only for the purpose of supporting contrib/xml2. The problem is that there are people who use the XSLT and xpath_table stuff on text data and so don't run into these bugs. I agree it's a mess but I don't think just abandoning the functionality is a good idea. I'm one of those people. :) Expecting to see contrib/xml2 go away at some point, possibly without replacements for xslt_process and xpath_table, I've been working on some plpgsql and plperlu work-alikes targeted at TEXT columns, as the xml2 versions do. I hope these (attached) will be of some help to others. Note, these are not the exact functions I use, they are lightly edited to remove the use of wrappers I've created to paper over the transition from xpath_nodeset() to core XPATH(). -- Mike Rylander | VP, Research and Design | Equinox Software, Inc. / The Evergreen Experts | phone: 1-877-OPEN-ILS (673-6457) | email: mi...@esilibrary.com | web: http://www.esilibrary.com xml2-replacements.sql Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Andrew Dunstan and...@dunslane.net writes: Robert Haas wrote: I think we need to either (1) fix the bugs and update the documentation to remove the statement that this will be removed or (2) actually remove it. I agree it's a mess but I don't think just abandoning the functionality is a good idea. Yeah, we can't really remove it until we have a plausible substitute for the xpath_table functionality. This is in the TODO list ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Yeah, we can't really remove it until we have a plausible substitute for the xpath_table functionality. This is in the TODO list ... What about moving it to pgfoundry? I'm really not keen on shipping known-broken stuff in /contrib. --Josh Berkus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
On Thu, Jan 28, 2010 at 5:54 PM, Josh Berkus j...@agliodbs.com wrote: Yeah, we can't really remove it until we have a plausible substitute for the xpath_table functionality. This is in the TODO list ... What about moving it to pgfoundry? I'm really not keen on shipping known-broken stuff in /contrib. Yeah, exactly. Another option - if we know that it works OK with inputs of certain types - might be to throw an error if we get an input of a type we know it doesn't work with. But I guess I haven't followed this closely enough to know exactly what kinds of arguments do/don't work. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
Robert Haas escribió: On Thu, Jan 28, 2010 at 5:54 PM, Josh Berkus j...@agliodbs.com wrote: Yeah, we can't really remove it until we have a plausible substitute for the xpath_table functionality. This is in the TODO list ... What about moving it to pgfoundry? I'm really not keen on shipping known-broken stuff in /contrib. Yeah, exactly. Another option - if we know that it works OK with inputs of certain types - might be to throw an error if we get an input of a type we know it doesn't work with. But I guess I haven't followed this closely enough to know exactly what kinds of arguments do/don't work. If it were easy to throw an error, it'd be better to avoid the crash. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers