Re: [HACKERS] Initial review of xslt with no limits patch

2010-08-07 Thread David E. Wheeler
On Aug 6, 2010, at 10:49 PM, Pavel Stehule wrote:

 Huh? You can select into an array:
 
 and pg doesn't handle 2D arrays well - you can't use the ARRAY(subselect)
 constructor for 2D arrays

Right.

 try=# select ARRAY(SELECT ARRAY[k,v] FROM foo);
 ERROR:  could not find array type for datatype text[]
 
 try SELECT ARRAY(SELECT row(k,v) FROM foo)

Yeah, but those aren't nested arrays. They're…well, they're ordered pairs. ;-P

 Sure, but it isn't relevant here - the problem is the built-in output
 functions for data types. For example, true is formatted one way in
 PostgreSQL and differently in XML or JSON. Date values are formatted
 differently in JSON and XML. So if you want to format some date value
 correctly and your interface is text-only, then you have to cast the
 value back to binary and format it again. Moreover, if you have
 information about the original data type, you can use the correct
 format. So if you use only text parameters, then you lose significant
 information (when some parameter is not text). For example, if I have
 only a text interface for some hypothetical JSON API, then I am not
 able to show a boolean value correctly - because it doesn't use
 quoting - and it is a string and not a number.

Point. FWIW, though, this is already an issue for non-SQL functions. PL/Perl, 
for example, gets all arguments cast to text, AFAICT:

try=# create or replace function try(bool) returns text language plperl AS 
'shift';
CREATE FUNCTION
Time: 121.403 ms
try=# select try(true);
 try 
-
 t
(1 row)

I wish this wasn't so.
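Pavel's JSON point above can be demonstrated outside the database. A minimal Python sketch (invented purely for illustration, not PostgreSQL code): once a boolean has been flattened to text, a JSON serializer can no longer emit it as a JSON boolean.

```python
import json

# With the original type preserved, the value serializes as a JSON boolean:
print(json.dumps({"flag": True}))   # {"flag": true}

# But once the value arrives as text (as a text-only PL interface would
# deliver PostgreSQL's 't'), the type information is gone and it can only
# be emitted as a quoted string:
print(json.dumps({"flag": "t"}))    # {"flag": "t"}
```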

 There is another issue - PL/pgSQL can't work well with untyped
 collections. But that isn't a problem for custom C functions, and there
 is no reason why we can't support polymorphic collections
 (+/- because these collections cannot be accessed from PL/pgSQL
 directly).

I completely agree with you here. I'd love to be able to support RECORD 
arguments to non-C functions.

 I agree that it's not as sugary as pairs would be. But I admit to having no 
 problem with
 
  SELECT foo(ARRAY[ ['foo', 'bar'], ['baz', 'yow']]);
 
 But maybe I'm biased, since there's a lot of that sort of syntax in pgTAP..
 
 
 Yes, when you are the author of the code, you know what you wrote. But
 what about when you have to do a review? Then a reviewer has to look at
 the definition of foo and verify that you used the parameters
 correctly. For two params I can't see at first glance which layout you
 used - [[key,key],[value,value]] or [[key,value],[key,value]]. Moreover,
 you have to use a nested data structure - which is less readable than
 variadic parameters. And - in pg - you lose information about the
 original data types.

Valid points. I agree that it would be nicer to use RECORDs:

SELECT foo( row('foo', 1), row('bar', true));

Certainly much clearer. But given that we've gone round and round on allowing 
non-C functions to use ROWs and gotten nowhere, I don't know that we'll get any 
further now. But can you not create a C function that allows a signature of 
VARIADIC RECORD?

Best,

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [JDBC] Trouble with COPY IN

2010-08-07 Thread Kris Jurka



On Fri, 6 Aug 2010, James William Pye wrote:


On Aug 6, 2010, at 4:31 PM, Kris Jurka wrote:

binary-copy-end-v2.patch


I think there's a snag in the patch:

postgres=# COPY data FROM '/Users/jwp/DATA.bcopy' WITH BINARY;
ERROR:  row field count is -1, expected 1
CONTEXT:  COPY data, line 4

Probably a quick/small fix away, I imagine.


Hmm, not quite sure why that is.  That seems to imply that it's not using 
V3 protocol, but I thought binary copy could only be used with the V3 
protocol.  In any case, I think this new patch is more bulletproof.


Kris Jurka
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
***
*** 2058,2069  CopyFrom(CopyState cstate)
int16   fld_count;
ListCell   *cur;
  
!   if (!CopyGetInt16(cstate, &fld_count) ||
!   fld_count == -1)
{
done = true;
break;
}
  
if (fld_count != attr_count)
ereport(ERROR,
--- 2058,2090 
int16   fld_count;
ListCell   *cur;
  
!   if (!CopyGetInt16(cstate, &fld_count))
{
done = true;
break;
}
+
+   if (fld_count == -1)
+   {
+   /*
+* Reached EOF.  In protocol version 3, we must wait for
+* the protocol end of copy (CopyDone/Fail).  If we
+* receive any more copy data after EOF, complain.
+*/
+   if (cstate->copy_dest == COPY_NEW_FE)
+   {
+   int8 unused;
+
+   if (CopyGetData(cstate, &unused, sizeof(unused), sizeof(unused)))
+   {
+   ereport(ERROR,
+   (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+errmsg("received copy data after EOF marker")));
+   }
+   }
+
+   done = true;
+   break;
+   }
  
if (fld_count != attr_count)
ereport(ERROR,



Re: [HACKERS] Initial review of xslt with no limits patch

2010-08-07 Thread Pavel Stehule
2010/8/7 David E. Wheeler da...@kineticode.com:
 On Aug 6, 2010, at 10:49 PM, Pavel Stehule wrote:

 Huh? You can select into an array:

 and pg doesn't handle 2D arrays well - you can't use the ARRAY(subselect)
 constructor for 2D arrays

 Right.

 try=# select ARRAY(SELECT ARRAY[k,v] FROM foo);
 ERROR:  could not find array type for datatype text[]

 try SELECT ARRAY(SELECT row(k,v) FROM foo)

 Yeah, but those aren't nested arrays. They're…well, they're ordered pairs.
 ;-P

 Sure, but it isn't relevant here - the problem is the built-in output
 functions for data types. For example, true is formatted one way in
 PostgreSQL and differently in XML or JSON. Date values are formatted
 differently in JSON and XML. So if you want to format some date value
 correctly and your interface is text-only, then you have to cast the
 value back to binary and format it again. Moreover, if you have
 information about the original data type, you can use the correct
 format. So if you use only text parameters, then you lose significant
 information (when some parameter is not text). For example, if I have
 only a text interface for some hypothetical JSON API, then I am not
 able to show a boolean value correctly - because it doesn't use
 quoting - and it is a string and not a number.

 Point. FWIW, though, this is already an issue for non-SQL functions. PL/Perl, 
 for example, gets all arguments cast to text, AFAICT:

 try=# create or replace function try(bool) returns text language plperl AS 
 'shift';
 CREATE FUNCTION
 Time: 121.403 ms
 try=# select try(true);
  try
 -
  t
 (1 row)

 I wish this wasn't so.


It doesn't have to be so - it depends on the PL handler implementation.
PostgreSQL calls the PL handler with binary values. I think the new
Python PL handles this well.

 There is another issue - PL/pgSQL can't work well with untyped
 collections. But that isn't a problem for custom C functions, and there
 is no reason why we can't support polymorphic collections
 (+/- because these collections cannot be accessed from PL/pgSQL
 directly).

 I completely agree with you here. I'd love to be able to support RECORD 
 arguments to non-C functions.

 I agree that it's not as sugary as pairs would be. But I admit to having no 
 problem with

  SELECT foo(ARRAY[ ['foo', 'bar'], ['baz', 'yow']]);

 But maybe I'm biased, since there's a lot of that sort of syntax in pgTAP..


 Yes, when you are the author of the code, you know what you wrote. But
 what about when you have to do a review? Then a reviewer has to look at
 the definition of foo and verify that you used the parameters
 correctly. For two params I can't see at first glance which layout you
 used - [[key,key],[value,value]] or [[key,value],[key,value]]. Moreover,
 you have to use a nested data structure - which is less readable than
 variadic parameters. And - in pg - you lose information about the
 original data types.

 Valid points. I agree that it would be nicer to use RECORDs:

    SELECT foo( row('foo', 1), row('bar', true));

I am not absolutely satisfied - but it's better than arrays.


 Certainly much clearer. But given that we've gone round and round on allowing 
 non-C functions to use ROWs and gotten nowhere, I don't know that we'll get 
 any further now. But can you not create a C function that allows a signature 
 of VARIADIC RECORD?

You can do a variadic over a ROW type. We don't have polymorphic arrays
- so it isn't possible to write VARIADIC RECORD now. It would be nice
if we could define our own composite types with polymorphic fields.
Then you could do:

CREATE TYPE pair AS (key text, value any);
CREATE FUNCTION foo(VARIADIC pair[])

Another idea is to leave arrays behind and think about a key/value
collection as a new kind of data type, so maybe:

CREATE FUNCTION foo(VARIADIC params COLLECTION OF text WITH UNIQUE text KEY)

Regards

Pavel



 Best,

 David



Re: [HACKERS] Surprising dead_tuple_count from pgstattuple

2010-08-07 Thread Heikki Linnakangas

On 07/08/10 07:43, Gordon Shannon wrote:

Regarding HOT prune, I never did any updates, so I think there couldn't be
any HOT tuples.  Or does HOT prune do more than that?


Yes, HOT will also prune away DELETEd tuples. It will leave behind a 
dead line pointer, so it won't stop the table from growing if you 
repeatedly delete and insert, but it will slow it down significantly.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] remove upsert example from docs

2010-08-07 Thread Marko Tiikkaja

On 8/5/2010 9:44 PM, Merlin Moncure wrote:

On Thu, Aug 5, 2010 at 2:09 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

I was not persuaded that there's a real bug in practice.  IMO, his
problem was a broken trigger not broken upsert logic.  Even if we
conclude this is unsafe, simply removing the example is of no help to
anyone.


Well, the error handler is assuming that the unique_violation is coming
from the insert made within the loop.  This is obviously not a safe
assumption in an infinite-loop context.  It should be double-checking
where the error was thrown from -- but the only way I can think
of to do that is to check sqlerrm.


Yeah, this is a known problem with our exception system.  If there was 
an easy and reliable way of knowing where the exception came from, I'm 
sure the example would include that.



Or are you arguing that if you're
doing this, all dependent triggers must not throw unique violations up
the exception chain?


If he isn't, I am.  I'm pretty sure you can break every example in the 
docs with a trigger (or a rule) you haven't thought through.



A more useful response would be to supply a correct example.

Agreed; I'd go further and argue for supplying both the 'safe' and
'high concurrency (with caveats)' ways.  I'm not saying the example is
necessarily bad, just that it's maybe not a good thing to point to
as a learning example without qualifications.  Then you get a lesson
both on upsert methods and on defensive error handling (barring
objection, I'll provide that).


The problem with the safe way is that it's not safe if called in a 
transaction with isolation level set to SERIALIZABLE.
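For readers following along, the upsert retry loop being debated reduces to this control flow, including Merlin's objection to it. This is a Python model invented purely for illustration ("db" stands in for the table, UniqueViolation for SQLSTATE 23505), not the actual docs text:

```python
class UniqueViolation(Exception):
    pass

db = {}

def insert(key, value):
    if key in db:
        raise UniqueViolation        # like a unique index rejecting a duplicate
    db[key] = value

def upsert(key, value):
    while True:
        if key in db:                # the UPDATE branch
            db[key] = value
            return
        try:
            insert(key, value)       # the INSERT branch
            return
        except UniqueViolation:
            # Assumes the violation came from *this* insert; a trigger
            # raising the same error would make this loop spin forever.
            continue
```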



Regards,
Marko Tiikkaja



Re: [HACKERS] Proposal / proof of concept: Triggers on VIEWs

2010-08-07 Thread Marko Tiikkaja

On 8/6/2010 10:49 AM, Dean Rasheed wrote:

On 4 August 2010 15:08, Marko Tiikkajamarko.tiikk...@cs.helsinki.fi  wrote:

I'm mainly concerned about concurrently running transactions.


Once again, I think I misunderstood your point. I think that the
database can't really lock anything before firing the trigger, because
the view might contain grouping/aggregates or might not be based on any
real tables at all, so it would be impossible to work out what to
lock.


Right.


Thus it would be up to the trigger function to get this right.
In the simplest case, for a DELETE, this might look something like:

CREATE OR REPLACE FUNCTION instead_of_delete_trig_fn()
RETURNS trigger AS
$$
BEGIN
   DELETE FROM base_table WHERE pk = OLD.pk;
   IF NOT FOUND THEN RETURN NULL; END IF;

   RETURN OLD;
END;
$$
LANGUAGE plpgsql;

If 2 users try to delete the same row, the second would block until
the first user's transaction finished, and if the first user
committed, the second user's trigger would return NULL, which the
database would signal as no rows deleted.


The problem is that this isn't even nearly sufficient.  I gave this some 
more thought while I was away, and it seems that I missed at least one 
more important thing: the WHERE clause.  Imagine this query:


DELETE FROM view WHERE pk = 1 AND f1 > 0;

Now the trigger function gets called if the row where pk = 1, as seen by 
the query's snapshot, has f1 > 0.  But if a concurrent transaction sets 
f1 to 0 before the trigger gets to the row, you end up deleting a row 
that doesn't match the WHERE clause.  I have a few ideas on how this
could be tackled, but I think we need to split these two threads.  I 
still think that having triggers on views without addressing these 
concurrency concerns is not a good idea, though.
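The race Marko describes can be modeled in miniature: the INSTEAD OF trigger body only receives the OLD row (here, just its pk), so it must re-check the statement's WHERE qualifications against the current row version itself. All names below are invented for illustration; this is a toy model, not PostgreSQL code:

```python
rows = {1: {"f1": 5}}

def instead_of_delete(pk, still_qualifies):
    row = rows.get(pk)
    # Re-evaluate the WHERE clause against the current row version;
    # without this re-check, a concurrently updated row that no longer
    # qualifies would still be deleted.
    if row is None or not still_qualifies(row):
        return 0                     # report "no rows deleted"
    del rows[pk]
    return 1

rows[1]["f1"] = 0                    # a concurrent transaction updates f1
deleted = instead_of_delete(1, lambda r: r["f1"] > 0)
```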



Regards,
Marko Tiikkaja



Re: [HACKERS] patch (for 9.1) string functions

2010-08-07 Thread Pavel Stehule
Hello

2010/8/7 Itagaki Takahiro itagaki.takah...@gmail.com:
 2010/7/26 Robert Haas robertmh...@gmail.com:
 Come to think of it, have we checked that the behavior of LEFT, RIGHT,
 REVERSE, etc. is the same on other DBs, especially as far as nulls,
 empty strings, too-large or negative subscripts, etc is concerned?  Is
 CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
 implements here?

 I made a discussion page in wiki for the compatibility issue.
 http://wiki.postgresql.org/wiki/String_Functions_and_Operators_Compatibility


nice, thank you

 Please fill empty cells and fix wrong descriptions.
  * concat() is not compatible between MySQL and Oracle/DB2. Which do we buy?

I prefer our implementation - it skips NULL values and it has
variadic arguments. MySQL's concat isn't very consistent - I don't know
why it has different NULL handling than concat_ws.

  * How do other databases behave in left() and right() with negative lengths?

I don't know about other databases with left() and right() functions. As
far as I know, only MS Access has these functions. The design of these
functions is inspired by the widely used Oracle library PLVision - this
library is freeware now - but my code is original. See plvstr.left() and
plvstr.right() - and a little bit by Python substring operations. The
point of negative arguments is eliminating unnecessary detoast
operations and utf8-related calculations. For right() it means skipping
the first n chars; for left(), skipping the last n chars. These functions
were originally designed for contrib - and I still think contrib is
better - my opinion isn't strong here - I prefer a fully functional
function in contrib over a minimalistic version in core. Minimalistic
versions are trivial via substring.
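The negative-length semantics described above can be sketched in Python (an illustration of the proposal as described, not PostgreSQL's actual code): for n >= 0 these behave like the usual left()/right(); for n < 0 they skip |n| characters from the opposite end.

```python
def left(s, n):
    # first n chars, or all but the last |n| chars when n is negative;
    # Python slicing happens to cover both cases with the same expression
    return s[:n]

def right(s, n):
    # last n chars, or skip the first |n| chars when n is negative
    return s[len(s) - n:] if n >= 0 else s[-n:]
```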

  * Are there any databases that has similar features with format() or
 sprintf() ?

I know only about a package from the PLVision library -

select plvsubst.string('My name is %s %s', ARRAY['Pavel','Stěhule']);

but you can find a lot of custom implementations. I found something
similar - not exactly this - in T-SQL; see the FORMATMESSAGE() function.
But the use of this function is very limited and it is a C API function
(available from T-SQL). It doesn't return a string, just writes to the log.
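The plvsubst.string call above maps directly onto positional %s formatting, which Python shares with C's sprintf. This is only an analogy for the interface being discussed, not PostgreSQL code:

```python
# Positional %s substitution, as in the PLVision example above:
message = "My name is %s %s" % ("Pavel", "Stěhule")
print(message)   # My name is Pavel Stěhule
```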



 And why does CONCAT() take a variadic ANY
 argument?  Shouldn't that be variadic TEXT?

 I think we have no other choice but to use VARIADIC any for these variadic
 functions. We have all combinations of argument types for the || operator,
 (text, text), (text, any), (any, text), but we cannot use that approach for
 variadic functions -- they have no limit on the number of arguments. And in
 that case, the functions should be STABLE because they convert arguments to
 text with typout functions that might be STABLE.


 IMHO, I'd repeat, the syntax for format() is a bad choice because it cannot
 concatenate multiple arguments without a separator, though RAISE also uses it.
 The %s format in sprintf() or the {n} syntax in C#'s String.Format() seems to
 be a better design.

I don't agree. This function isn't designed to replace string
concatenation. It is designed to build SQL strings (for dynamic SQL) or
format messages. It isn't designed to replace the to_char function. It
is designed to work mainly inside PL/pgSQL functions, and is therefore
consistent with the RAISE statement.

Thank you

Regards

Pavel Stehule


 --
 Itagaki Takahiro




Re: [HACKERS] gincostestimate

2010-08-07 Thread Tom Lane
I wrote:
 1. The use of rd_amcache is very questionable.

Attached is an alternate patch that I think you should give serious
consideration to.  The basic idea here is to only update the metapage
stats data during VACUUM, and not bother with incremental updates during
other operations.  That gets rid of a lot of the complexity and
opportunities for bugs-of-omission in the original approach, and also
reduces contention for the metapage as well as WAL traffic.
gincostestimate can compensate fairly well for index growth since the
last VACUUM by scaling up the recorded values by the known growth ratio
of the overall index size.  (Note that the index-pages count passed to
gincostestimate is accurate, having been recently gotten from
RelationGetNumberOfBlocks.)  Of course, this is only approximate, but
considering that the equations the values are going to be fed into are
even more approximate, I don't see a problem with that.
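The compensation described above is simple arithmetic: the stats recorded at the last VACUUM are scaled by the index's growth ratio since then. A sketch (the function name and arguments are invented for illustration; this is not the patch's code):

```python
def scale_recorded_stat(recorded, pages_at_vacuum, current_pages):
    # Scale a value recorded at the last VACUUM by the index's growth
    # ratio, using the accurate current page count from
    # RelationGetNumberOfBlocks as the numerator.
    if pages_at_vacuum <= 0:
        return recorded              # nothing sane to scale by
    return recorded * (current_pages / pages_at_vacuum)
```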

I also dropped the use of rd_amcache, instead having ginGetStats()
just read the metapage every time.  Since the planner stats per se
are now only updated during vacuum, it would be reasonable to use
rd_amcache to remember them, but there's still a problem with
nPendingPages.  I think that keeping it simple is the way to go,
at least until someone can show a performance problem with this way.

I didn't do anything about the questionable equations in
gincostestimate.  Those need to either be fixed, or documented as
to why they're correct.  Other than that I think this could be
committed.

regards, tom lane

PS: I still haven't tested this further than running the regression
tests, since I see little point in trying to check its estimation
behavior until those equations are fixed.  However, the hstore
regression test did expose a core dump in gincostestimate (failing
to guard against null partial_matches), which I have fixed here.



bink8QUUkvuQI.bin
Description: gincostestimate-0.21.gz



Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-08-07 Thread Mike Fowler

On 06/08/10 20:55, Peter Eisentraut wrote:

On fre, 2010-08-06 at 09:04 +0100, Mike Fowler wrote:

If the patch is to be committed, does it make sense for me to refine
it such that it uses the new xpath internal function you extracted in
the xmlexists patch?


Yes, you can probably shrink this patch down to about 20 lines.



Updated the patch so that it will apply to head and re-worked the 
function to use the new xpath internal function.


Regards,

--
Mike Fowler
Registered Linux user: 379787
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8693,8698  SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
--- 8693,8731 
  (1 row)
  ]]/screen
 /para
+ 
+sect3
+ titlexpath_exists/title
+ 
+ indexterm
+  primaryxpath_exists/primary
+ /indexterm
+ 
+ synopsis
+  functionxpath_exists/function(replaceablexpath/replaceable, replaceablexml/replaceableoptional, replaceablensarray/replaceable/optional)
+ /synopsis
+ 
+ para
+  The function functionxpath_exists/function is a specialised form
+  of the functionxpath/function function. Though the functions are
+  syntactically the same the xpath expressions are evaluated in differing
+  contexts. Instead of returning the XML values that satisfy the xpath, this
+  function returns a boolean indicating whether the query was satisfied or not.
+ /para
+ 
+ para
+  Example:
+  screen![CDATA[
+ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>', 
+  ARRAY[ARRAY['my', 'http://example.com']]);
+ 
+  xpath_exists  
+ 
+  t
+ (1 row)
+ ]]/screen
+  /para
+ /sect3
/sect2
  
sect2 id=functions-xml-mapping
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3541,3543  Datum xmlexists(PG_FUNCTION_ARGS)
--- 3541,3567 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean. Differs from
+  * xmlexists as it supports namespaces and is not defined in SQL/XML.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	int			res_nitems;
+ 
+ 	xpath_internal(xpath_expr_text, data, namespaces,
+    res_nitems, NULL);
+ 
+ 	PG_RETURN_BOOL(res_nitems > 0);
+ #else
+ 	NO_XML_SUPPORT();
+ 	return 0;
+ #endif
+ }
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***
*** 4390,4395  DESCR(evaluate XPath expression);
--- 4390,4400 
  DATA(insert OID = 2614 (  xmlexists	 PGNSP PGUID 12 1 0 0 f f f t f i 2 0 16 25 142 _null_ _null_ _null_ _null_ xmlexists _null_ _null_ _null_ ));
  DESCR(test XML value against XPath expression);
  
+ DATA(insert OID = 3037 (  xpath_exists	 PGNSP PGUID 12 1 0 0 f f f t f i 3 0 16 25 142 1009 _null_ _null_ _null_ _null_ xpath_exists _null_ _null_ _null_ ));
+ DESCR(evaluate XPath expression in a boolean context, with namespaces support);
+ DATA(insert OID = 3038 (  xpath_exists	 PGNSP PGUID 14 1 0 0 f f f t f i 2 0 16 25 142 _null_ _null_ _null_ _null_ select pg_catalog.xpath_exists($1, $2, ''{}''::pg_catalog.text[]) _null_ _null_ _null_ ));
+ DESCR(evaluate XPath expression in a boolean context);
+ 
  /* uuid */
  DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 0 f f f t f i 1 0 2950 2275 _null_ _null_ _null_ _null_ uuid_in _null_ _null_ _null_ ));
  DESCR(I/O);
*** a/src/include/utils/xml.h
--- b/src/include/utils/xml.h
***
*** 37,42  extern Datum texttoxml(PG_FUNCTION_ARGS);
--- 37,43 
  extern Datum xmltotext(PG_FUNCTION_ARGS);
  extern Datum xmlvalidate(PG_FUNCTION_ARGS);
  extern Datum xpath(PG_FUNCTION_ARGS);
+ extern Datum xpath_exists(PG_FUNCTION_ARGS);
  extern Datum xmlexists(PG_FUNCTION_ARGS);
  
  extern Datum table_to_xml(PG_FUNCTION_ARGS);
*** a/src/test/regress/expected/xml.out
--- b/src/test/regress/expected/xml.out
***
*** 502,507  SELECT xpath('//b', 'aone btwo/b three betc/b/a');
--- 502,560 
   {btwo/b,betc/b}
  (1 row)
  
+ -- Test xpath_exists evaluation
+ SELECT xpath_exists('//town[text() = ''Toronto'']','<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml);
+  xpath_exists 
+ --
+  f
+ (1 row)
+ 
+ SELECT xpath_exists('//town[text() = ''Cwmbran'']','<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml);
+  xpath_exists 
+ --
+  t
+ (1 row)
+ 
+ INSERT INTO xmltest VALUES (4, '<menu><beers><name>Budvar</name><cost>free</cost><name>Carling</name><cost>lots</cost></beers></menu>'::xml);
+ INSERT INTO xmltest VALUES (5, '<menu><beers><name>Molson</name><cost>free</cost><name>Carling</name><cost>lots</cost></beers></menu>'::xml);
+ INSERT INTO xmltest VALUES (6, 'myns:menu 

Re: [HACKERS] review: xml_is_well_formed

2010-08-07 Thread Mike Fowler

On 06/08/10 21:55, Peter Eisentraut wrote:

On fre, 2010-08-06 at 14:43 +0100, Mike Fowler wrote:

Or perhaps it could return a string instead of a boolean: content,
document, or NULL if it's neither.



I like the sound of that. In fact this helps work around the IS DOCUMENT
and IS CONTENT limitations, such that you can select only
content, only documents, or both if you use IS NOT NULL.

Unless anyone sees a reason that this function needs to remain a
boolean function, I'll rework the patch over the weekend.


What is the actual use case for this function?  Is the above behavior
actually useful?


The idea is to be able to filter a table that contains XML in TEXT columns 
that might not be well formed. Knowing that you're only dealing with well-
formed XML prevents you from blowing up when you attempt the cast.




One reason to stick with boolean is backward compatibility.



To be honest I'm happiest with returning a boolean, even if there is 
some confusion over content only being valid. Though changing the return 
value to DOCUMENT/CONTENT/NULL makes things a touch more explicit, the 
same results can be achieved by simply running:


SELECT data::xml FROM mixed WHERE xml_is_well_formed(data) AND data::xml 
IS DOCUMENT;


Regards,
--
Mike Fowler
Registered Linux user: 379787



[HACKERS] Moderator on Committers?

2010-08-07 Thread Simon Riggs

I notice that there are many spam messages coming through on Committers.

That seems a little strange, since one of my commit messages has been
held for moderator approval. (Apparently the word sub just happened to
get wrapped into first byte position, and so has been confused with a
subscribe message).

Who is approving spam, yet refusing to permit messages from actual
committers to the commit list?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] Moderator on Committers?

2010-08-07 Thread Marc G. Fournier


In this case, was it email From @news.postgresql.org to @postgresql.org? 
If so, this has already been corrected ...


On Fri, 6 Aug 2010, Simon Riggs wrote:



I notice that there are many spam messages coming through on Committers.

That seems a little strange, since one of my commit messages has been
held for moderator approval. (Apparently the word sub just happened to
get wrapped into first byte position, and so has been confused with a
subscribe message).

Who is approving spam, yet refusing to permit messages from actual
committers to the commit list?

--
Simon Riggs   www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services






Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org



Re: [HACKERS] pg_stat_transaction patch

2010-08-07 Thread Tom Lane
Itagaki Takahiro itagaki.takah...@gmail.com writes:
 Accessor functions to get so far collected statistics for the current
 transaction
 https://commitfest.postgresql.org/action/patch_view?id=301

 The latest version of the patch works as expected, and also well-formed.
 I'll mark the patch to Ready for Committer.

I'm working through this patch now.  I kind of think that we ought to
drop the functions and view columns that claim to report live/dead
tuples.  In the first place, they're misnamed, because what they're
actually reporting is delta values (ie, new live tuples or new dead
tuples).  In the second place, they don't seem very useful.  The
live_tuples count is absolutely, positively guaranteed to read out as
zero, because a transaction that hasn't reached commit cannot have
created any known-live tuples.  The dead_tuples count can read out as
positive under certain circumstances, for example if a subtransaction
inserted some tuples and was then rolled back --- we know for certain
those tuples are dead and so the t_delta_dead_tuples count gets
incremented at subtransaction rollback.  But for the most part the
dead_tuples count is going to be a lot less than people might expect
based on what the transaction's done so far.

If we keep these we're going to have to document them a lot better
than the submitted patch does.  But I think we should just take 'em out.
Comments?

regards, tom lane



Re: [HACKERS] Surprising dead_tuple_count from pgstattuple

2010-08-07 Thread Gordon Shannon

I think this simple test highlights the question well.

--
create temporary table t(x int) with (autovacuum_enabled=off);
insert into t select x from generate_series(1,10000,1)x;
vacuum verbose t;
select dead_tuple_count from pgstattuple('t');-- 0
delete from t where x <= 100;
select dead_tuple_count from pgstattuple('t');-- 100
delete from t where x <= 300;
select dead_tuple_count from pgstattuple('t');-- 200 (not 300)
vacuum verbose t;
vacuum verbose t;
select dead_tuple_count from pgstattuple('t');-- 0
drop table t;
--

When we get to the 2 vacuums at the end, I expect it to vacuum 300 rows, but
it only appears to vacuum 200...

vacuum verbose t;
psql:test:15: INFO:  vacuuming pg_temp_2.t
psql:test:15: INFO:  t: removed 200 row versions in 2 pages
psql:test:15: INFO:  t: found 200 removable, 9700 nonremovable row
versions in 45 out of 45 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
Time: 69.792 ms

Yet, the next vacuum reports that it found 300 unused item pointers. So they
were all vacuumed by somebody.

psql:test:16: INFO:  vacuuming pg_temp_2.t
psql:test:16: INFO:  t: found 0 removable, 7158 nonremovable row versions
in 33 out of 45 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 300 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
Time: 38.436 ms

So, I guess my real question here is, what happened to the missing 100
items?  If it was HOT prune, can anyone summarize what that does?

Thanks!



[HACKERS] scheduling

2010-08-07 Thread Robert Haas
Do we have a projected date for the next 9.0 wrap, and will it be
beta5 or rc1?  How much should we worry about the remaining open
items?

http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items

I am inclined to say that all three of the items currently on the list
need to be addressed in some way before we move forward... or at least
the last two.

* ExplainOnePlan handles snapshots differently than ProcessQuery
* Backup procedure is wrong?
* Walreceiver crashes in AIX

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] scheduling

2010-08-07 Thread Alex Hunsaker
On Sat, Aug 7, 2010 at 20:13, Robert Haas robertmh...@gmail.com wrote:
 Do we have a projected data for the next 9.0 wrap, and will it be
 beta5 or rc1?  How much should we worry about the remaining open
 items?

If we are taking a vote, I +1 for rc1 :).

 http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items

 I am inclined to say that all three of the items currently on the list
 need to be addressed in some way before we move forward... or at least
 the last two.

 * ExplainOnePlan handles snapshots differently than ProcessQuery

Isn't this one not specific to 9.0?  If so I don't think it should be
a blocker.

 * Backup procedure is wrong?

This *sounds* like a documentation bug, seems like something we can
fix while in -rc to me.



Re: [HACKERS] Initial review of xslt with no limits patch

2010-08-07 Thread David E. Wheeler
On Aug 7, 2010, at 12:24 AM, Pavel Stehule wrote:

 try=# create or replace function try(bool) returns text language plperl AS 
 'shift';
 CREATE FUNCTION
 Time: 121.403 ms
 try=# select try(true);
  try
 -
  t
 (1 row)
 
 I wish this wasn't so.
 
 
 It doesn't have to be so - it depends on the PL handler implementation.
 PostgreSQL calls the PL handler with binary values. I think the new
 Python PL handles this well.

I'm thinking an update to PL/Perl would be useful. Frankly, I'd most like to 
see proper array support. But that's another topic.

 Valid points. I agree that it would be nicer to use RECORDs:
 
SELECT foo( row('foo', 1), row('bar', true));
 
 I am not absolutely satisfied - but it's better than arrays.

 Certainly much clearer. But given that we've gone round and round on 
 allowing non-C functions to use ROWs and gotten nowhere, I don't know that 
 we'll get any further now. But can you not create a C function that allows a 
 signature of VARIADIC RECORD?
 
 You can do a variadic over a ROW type. We don't have polymorphic arrays
 - so it isn't possible to write VARIADIC RECORD now.

Ah, right. I guess table types can't be cast to RECORD?

 It would be nice
 if we could define our own composite types with polymorphic fields.
 Then you could do:
 
 CREATE TYPE pair AS (key text, value any);
 CREATE FUNCTION foo(VARIADIC pair[])

Yes.

 Another idea is to leave arrays behind and think about a key/value
 collection as a new kind of data type, so maybe:
 
 CREATE FUNCTION foo(VARIADIC params COLLECTION OF text WITH UNIQUE text KEY)

COLLECTION?

David

