Re: [HACKERS] Partitioning option for COPY

2009-11-24 Thread Hannu Krosing
On Mon, 2009-11-23 at 10:24 -0500, Emmanuel Cecchet wrote:

 But it looks like it is a waste of everybody's time to continue this 
 discussion further. Just move the patch to the rejected patches and 
 let's wait for Itagaki's implementation.

Emmanuel, please try to work together with Itagaki san on getting the
bigger vision implemented, as this is a thing that can benefit a lot
from more people who have taken the time to learn about the parts of
code involved.

Even though this patch will not get in, most of the effort in developing
it is not actual coding, but familiarizing yourself with the other code
involved. 

Coding actual patches should be easy once you know the code _and_ the
desired result. 

You probably already know a lot of what is required to help us toward the
common goal of a clean implementation of partitioning.

 -- 
 Emmanuel Cecchet
 Aster Data
 Web: http://www.asterdata.com
 
 


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Hannu Krosing
On Mon, 2009-11-23 at 18:31 -0500, Greg Smith wrote:
 Robert Haas wrote:
  I'm fuzzy on what problem this is attempting to solve...  as mentioned
  in the above guidelines, it's usually good to start with some design
  discussions before writing/submitting code.
 This has been through some heavy design discussions with a few PG 
 hackers you know and some you don't, they just couldn't release the 
 result until now.  As for what it's good for, if you look at what you 
 can do now with dblink, you can easily move rows between nodes using 
 dblink_build_sql_insert.  This is perfectly fine for small bits of work, 
 but the performance isn't nearly good enough to do serious replication 
 with it.  The upper level patch here allows using COPY as the mechanism 
 to move things between them, which is much faster for some use cases 
 (which includes most of the really useful ones).  It dramatically 
 increases the scale of what you can move around using dblink as the 
 replication transport.
 
 The lower level patch is needed to build that layer, which is an 
 immediate proof of its utility.  In addition, adding a user-defined 
 function as a COPY target opens up all sorts of possibilities for things 
 like efficient ETL implementation.  And if this approach is accepted as 
 a reasonable one, as Dan suggested a next step might even be to 
 similarly allow passing COPY FROM through a UDF, which has the potential 
 to provide a new efficient implementation path for some of the custom 
 input filter requests that pop up here periodically.

Can this easily be extended to do things like

COPY stdin TO udf();
or
COPY udf() FROM stdin;

so that I could write a simple partitioning function, either locally for
partitioned tables or using pl/proxy for partitioned databases?
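
For illustration, a minimal sketch of what such a local routing function could
look like (all names here are made up, and the COPY-to-UDF form itself is only
a proposal in this thread):

CREATE TABLE events      (id int, created date, payload text);
CREATE TABLE events_2009 (CHECK (created >= DATE '2009-01-01' AND created < DATE '2010-01-01')) INHERITS (events);
CREATE TABLE events_2010 (CHECK (created >= DATE '2010-01-01' AND created < DATE '2011-01-01')) INHERITS (events);

CREATE FUNCTION route_event(e events) RETURNS void AS $$
BEGIN
    -- choose the child table from the date; a pl/proxy variant would instead
    -- RUN ON a remote partition picked by hashing a key
    IF e.created < DATE '2010-01-01' THEN
        INSERT INTO events_2009 SELECT (e).*;
    ELSE
        INSERT INTO events_2010 SELECT (e).*;
    END IF;
END;
$$ LANGUAGE plpgsql;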

 -- 
 Greg Smith2ndQuadrant   Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com  www.2ndQuadrant.com
 
 





Re: [HACKERS] Partitioning option for COPY

2009-11-24 Thread Itagaki Takahiro

Hannu Krosing ha...@2ndquadrant.com wrote:

 Even though this patch will not get in, most of the effort in developing
 it is not actual coding, but familiarizing yourself with the other code
 involved. 

I just edited a wiki page for this discussion.
I hope it can be of help.
http://wiki.postgresql.org/wiki/Table_partitioning

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Hannu Krosing
On Mon, 2009-11-23 at 16:25 -0800, Daniel Farina wrote:
 On Mon, Nov 23, 2009 at 4:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  pgsql-hackers had some preliminary discussions a couple months back
  on refactoring COPY to allow things like this --- see the thread
  starting here:
  http://archives.postgresql.org/pgsql-hackers/2009-09/msg00486.php
  While I don't think we arrived at any final decisions, I would like
  to know how this design fits in with what was discussed then.
 
 This seems to be about importing/ingress, whereas this patch is about
 exporting/egress...it is an interesting question on how much parsing
 to do before on the ingress side before handing a row to a function
 though, 

My suggestion for 

COPY func(rowtype) FROM stdin;

would be to pass the function a fully processed row of that type, with
all fields resolved and converted to the right types.

It may be useful to also have forms like

COPY func(text) FROM stdin;

and

COPY func(bytea[]) FROM stdin;

for getting less processed input.
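
A hypothetical sketch of the three target-function shapes this would imply
(the COPY ... FROM stdin target forms themselves do not exist, and the table
and function names are invented; only the argument types matter here):

CREATE TABLE target (id int, payload text);

CREATE FUNCTION ingest_row(target) RETURNS void AS            -- fully typed row
  $$ INSERT INTO target VALUES (($1).id, ($1).payload) $$ LANGUAGE sql;

CREATE FUNCTION ingest_line(text) RETURNS void AS             -- raw input line
  $$ INSERT INTO target (payload) VALUES ($1) $$ LANGUAGE sql;

CREATE FUNCTION ingest_fields(bytea[]) RETURNS void AS        -- split but unparsed fields
  $$ INSERT INTO target (payload) VALUES (encode($1[2], 'escape')) $$ LANGUAGE sql;
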

 should we try to make these kinds of operations a bit more
 symmetrical.



-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Hannu Krosing
On Sun, 2009-11-22 at 18:51 -0500, Tom Lane wrote:
 Craig Ringer cr...@postnewspapers.com.au writes:
  I do think this comes up often enough that a built-in trigger update
  named column with result of expression on insert trigger might be
  desirable.
 
 There's something of the sort in contrib already, I believe, though
 it's so old it still uses abstime :-(

What's wrong with abstime?

It is valid for timestamps up to 2038-01-19, and its on-disk size is
smaller than the other timestamp options.
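
For illustration only (abstime is deprecated, and the cast shown is an
assumption about a server that still has the type), the size difference is
easy to see with pg_column_size:

SELECT pg_column_size(now()::abstime)     AS abstime_bytes,      -- 4
       pg_column_size(now()::timestamptz) AS timestamptz_bytes;  -- 8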


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 12:29 AM, Hannu Krosing ha...@krosing.net wrote:
 COPY stdin TO udf();

If stdin becomes (is?) a legitimate source of records, then this patch
will Just Work.

The patch is already quite useful in the COPY (SELECT ...) TO FUNCTION
... scenario.

 COPY udf() FROM stdin;

This is unaddressed, but I think it would be a good idea to consider
enabling this kind of thing prior to application.

fdr



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 12:38 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 COPY func(rowtype) FROM stdin;

I didn't consider rowtype...I did consider a type list, such as:

COPY func(typea, typeb, typec) FROM ...

Which would then operate just like a table, but be useless for the
data-cleaning case, and would not allow type overloading to be the
mechanism of selecting behavior.

It was just an idea...I don't really know the use cases well enough;
my anticipated use case was updating eagerly materialized quantities
with the UDF.

fdr



Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Thom Brown
2009/11/24 Hannu Krosing ha...@2ndquadrant.com

 On Sun, 2009-11-22 at 18:51 -0500, Tom Lane wrote:
  Craig Ringer cr...@postnewspapers.com.au writes:
   I do think this comes up often enough that a built-in trigger update
   named column with result of expression on insert trigger might be
   desirable.
 
  There's something of the sort in contrib already, I believe, though
  it's so old it still uses abstime :-(

 What's wrong with abstime ?

 it is valid for timestamps up to 2038-01-19 and it's on-disk size
 smaller than other timestamp options


But it's very, very deprecated and could be removed at any time.  It's been
that way for years now, and I wouldn't want to *start* using something which
is deprecated.

Thom


Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Pavel Stehule
Hello

I think this patch may be a good idea, but I am missing a better
function specification. Specification by name isn't enough - we can
have overloaded functions. This syntax doesn't allow the use of an explicit
cast - and, from my personal view, the syntax is ugly - with a type
specification we don't need the keyword FUNCTION.

What about:

variant a) by type specification

COPY table TO foo(varchar, int)

variant b) by value expression - it allows changes in ordering, casting, and constants

COPY table TO foo($3, $1::date, $2::text, CURRENT_DATE, true);

One question:
We have a fast COPY statement - OK; we have a fast function - OK; but
inside the function we have to call a slow SQL query. What, then, is the
advantage?

We need pipes, like

COPY table TO foo(..) TO table

where foo() would be a transformation function, or a real pipe function.

Regards
Pavel Stehule



Re: [HACKERS] Partitioning option for COPY

2009-11-24 Thread Simon Riggs
On Tue, 2009-11-24 at 17:30 +0900, Itagaki Takahiro wrote:
 Hannu Krosing ha...@2ndquadrant.com wrote:
 
  Even though this patch will not get in, most of the effort in developing
  it is not actual coding, but familiarizing yourself with the other code
  involved. 
 
 I just edited a wiki page for this discussion.
 I hope it can be a help.
 http://wiki.postgresql.org/wiki/Table_partitioning
 

Good job. Looks like a clear path forwards to me.

I've made a couple of minor clarifications.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 2:10 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 I thing, so this patch is maybe good idea. I am missing better
 function specification. Specification by name isn't enough - we can
 have a overloaded functions. This syntax doesn't allow to use explicit
 cast - from my personal view, the syntax is ugly - with type
 specification we don't need to keyword FUNCTION

As long as things continue to support the INTERNAL-type behavior for
extremely low overhead bulk transfers I am open to suggestions about
how to enrich things...but how would I do so under this proposal?

I am especially fishing for suggestions in the direction of managing
state for the function between rows though...I don't like how the
current design seems to scream use a global variable.

 We have a fast copy statement - ok., we have a fast function ok, but
 inside a function we have to call slow sql query. Personally What is
 advantage?

The implementation here uses a type 'internal' for performance.  It
doesn't even recompute the fcinfo because of the very particular
circumstances of how the function is called.  It doesn't do a memory
copy of the argument buffer either, to the best of my knowledge.  In
the dblink patches you basically stream directly from the disk, format
the COPY bytes, and shove it into a waiting COPY on another postgres
node...there's almost no additional work in-between.  All utilized
time would be some combination of the normal COPY byte stream
generation and libpq.

This, of course, presumes that everyone who is interested in building
on this is going to use some UDFs written in C...


 We need pipes like

 like COPY table TO foo(..) TO table

 foo() should be a transformation function, or real pipe function

I've actually considered this pipe thing with a colleague while
driving home from work...it occurred to us that it would be nice to
have both pipes and tees (basically composition vs. mapping
application of functions over the input) in some form.  Not sure what
an elegant way to express that is or how to control it.  Since you can
work around this by composing or applying functions on your own in
another function, I'm not sure if that's as high priority for me
personally.

fdr



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Hannu Krosing
On Tue, 2009-11-24 at 02:37 -0800, Daniel Farina wrote:
 On Tue, Nov 24, 2009 at 2:10 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
  Hello
 
  I thing, so this patch is maybe good idea. I am missing better
  function specification. Specification by name isn't enough - we can
  have a overloaded functions. This syntax doesn't allow to use explicit
  cast - from my personal view, the syntax is ugly - with type
  specification we don't need to keyword FUNCTION
 
 As long as things continue to support the INTERNAL-type behavior for
 extremely low overhead bulk transfers I am open to suggestions about
 how to enrich things...but how would I do so under this proposal?
 
 I am especially fishing for suggestions in the direction of managing
 state for the function between rows though...I don't like how the
 current design seems to scream use a global variable.

Can't you use the existing aggregate function design?

CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
SFUNC = sfunc,
STYPE = state_data_type
[ , FINALFUNC = ffunc ]
[ , INITCOND = initial_condition ]
[ , SORTOP = sort_operator ]
)

and maybe use an additional INITFUNC=, if you need it for dblink-type
things which don't do connection management automatically the way
pl/proxy does.
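
For reference, a concrete aggregate built from that template - the classic
array_accum example from the documentation of that era (array_append and
anyarray are built in) - just to show the shape:

CREATE AGGREGATE array_accum (anyelement) (
    SFUNC = array_append,
    STYPE = anyarray,
    INITCOND = '{}'
);

-- e.g. SELECT array_accum(relname) FROM pg_class WHERE relkind = 'r';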

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 2:50 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 Can't you use existing aggregate function design ?

 CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
    SFUNC = sfunc,
    STYPE = state_data_type
    [ , FINALFUNC = ffunc ]
    [ , INITCOND = initial_condition ]
    [ , SORTOP = sort_operator ]
 )

Actually, yes.  I just thought that this was an idea so crazy that no
one would like it.

fdr



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 2:51 AM, Daniel Farina drfar...@gmail.com wrote:
 On Tue, Nov 24, 2009 at 2:50 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 Can't you use existing aggregate function design ?

 CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
    SFUNC = sfunc,
    STYPE = state_data_type
    [ , FINALFUNC = ffunc ]
    [ , INITCOND = initial_condition ]
    [ , SORTOP = sort_operator ]
 )

 Actually, yes.  I just thought that this was an idea so crazy that no
 one would like it.

Oh, and the other elephant in the room: error handling.  How to handle
error conditions...try/catch/finally type stuff.  Aggregates do not
necessarily provide a slot for this one.  I did consider using
aggregates though, but somehow it felt to me like I need at least a
three-tuple, why not fish around for any random bundling of three
functions...

After all, I would not want to actually call the nodeAgg stuff to
apply the function anyway...so it'd basically be abused as a
three-tuple of functions.

Also, what if you wanted to, say, replace the mechanism for COPY TO
'file'?  It'd be nice to make the following interaction (which uses
some implied global variables) not use such global variables:

BEGIN;
select open_file('/tmp/file', 'w+');
copy foo to function write_to_file;
-- what happens here if COPY aborts?  Does the transaction being in
the error state mean that files will not get closed?
select close_file();
COMMIT;

fdr



Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Hannu Krosing
On Tue, 2009-11-24 at 09:46 +, Thom Brown wrote:
 2009/11/24 Hannu Krosing ha...@2ndquadrant.com
 On Sun, 2009-11-22 at 18:51 -0500, Tom Lane wrote:
  Craig Ringer cr...@postnewspapers.com.au writes:
   I do think this comes up often enough that a built-in
 trigger update
   named column with result of expression on insert trigger
 might be
   desirable.
 
  There's something of the sort in contrib already, I believe,
 though
  it's so old it still uses abstime :-(
 
 
 What's wrong with abstime ?
 
 it is valid for timestamps up to 2038-01-19 and it's on-disk
 size
 smaller than other timestamp options
 
 
 But it's very very deprecated and could be removed at any time.  It's
 been so for years now, and I wouldn't want to *start* using something
 which is deprecated.
 
 Thom

I'd expect it to have an afterlife as a separately maintained type
somewhere for those who care about data sizes, similar to other space
savers like the ip4 type.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Hannu Krosing
On Tue, 2009-11-24 at 02:56 -0800, Daniel Farina wrote:
 On Tue, Nov 24, 2009 at 2:51 AM, Daniel Farina drfar...@gmail.com wrote:
  On Tue, Nov 24, 2009 at 2:50 AM, Hannu Krosing ha...@2ndquadrant.com 
  wrote:
  Can't you use existing aggregate function design ?
 
  CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
 SFUNC = sfunc,
 STYPE = state_data_type
 [ , FINALFUNC = ffunc ]
 [ , INITCOND = initial_condition ]
 [ , SORTOP = sort_operator ]
  )
 
  Actually, yes.  I just thought that this was an idea so crazy that no
  one would like it.

It seems like a natural choice to me - in essence this is an aggregate
function, aggregating over the rows/tuples supplied to it.

 Oh, and the other elephant in the room: error handling.  How to handle
 error conditions...try/catch/finally type stuff. 

Same as current aggregates - either ignore the error, log it and
continue, or bail out.

  Aggregates do not necessarily provide a slot for this one. 

Neither do ordinary functions; we have no ON ERROR DO ... clause for
function definitions.

 I did consider using
 aggregates though, but somehow it felt to me like I need at least a
 three-tuple, why not fish around for any random bundling of three
 functions...

Why do you need three?

 After all, I would not want to actually call the nodeAgg stuff to
 apply the function anyway...so it'd basically be abused as a
 three-tuple of functions.

Actually it would be best if it could use straight generic functions, so
you could do something like

COPY stdin TO filterfunc(int) TO avg(int);

You can bypass using nodeAgg in your own C functions as an optimisation.

 Also, what if you wanted, say, replace the mechanism for COPY TO
 'file'?  It'd be nice to make the following interaction (which uses
 some implied global variables) not use such global variables:
 
 BEGIN;
 select open_file('/tmp/file', 'w+');
 copy foo to function write_to_file;
 -- what happens here if COPY aborts?  Does the transaction being in
 the error state mean that files will not get closed?
 select close_file();
 COMMIT;

pass the file name in as an argument to SFUNC, open it on the first call,
and ignore it on later calls (if it stays the same ;)

For foreign connections, use SQL/MED and pass in the handle to the foreign
data.
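
A hedged sketch of that "latch the file name on the first call" pattern, using
only SQL-level pieces (all names invented; a real sink would need a C-level or
untrusted-PL function to actually write the file, so here the state just
carries the name and the accumulated rows):

CREATE TYPE copy_sink_state AS (fname text, buf text);

CREATE FUNCTION copy_sink_sfunc(st copy_sink_state, fname text, line text)
RETURNS copy_sink_state AS $$
    SELECT COALESCE((st).fname, fname),              -- remember the name from the first row
           COALESCE((st).buf, '') || line || E'\n'   -- accumulate the rows in the state
$$ LANGUAGE sql;

CREATE AGGREGATE copy_sink (text, text) (
    SFUNC = copy_sink_sfunc,
    STYPE = copy_sink_state
);

-- e.g. SELECT copy_sink('/tmp/out.txt', id::text) FROM some_table;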


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 3:25 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 On Tue, 2009-11-24 at 02:56 -0800, Daniel Farina wrote:
 On Tue, Nov 24, 2009 at 2:51 AM, Daniel Farina drfar...@gmail.com wrote:
  On Tue, Nov 24, 2009 at 2:50 AM, Hannu Krosing ha...@2ndquadrant.com 
  wrote:
  Can't you use existing aggregate function design ?
 
  CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
     SFUNC = sfunc,
     STYPE = state_data_type
     [ , FINALFUNC = ffunc ]
     [ , INITCOND = initial_condition ]
     [ , SORTOP = sort_operator ]
  )
 
  Actually, yes.  I just thought that this was an idea so crazy that no
  one would like it.

 seems kind of natural choice for me - in essence this is an aggregate
 function, aggregating over rows/tuples supplied to it.

Okay, well, maybe that wasn't such a crazy idea after all...

 Oh, and the other elephant in the room: error handling.  How to handle
 error conditions...try/catch/finally type stuff.

 Same as current aggregates - either ignore the error, logi it and
 continue, or bail out
[snip]
 Neither do ordinary funtions, we have no ON ERROR DO ... clause  for
 function definitions

It is assumed most functions do not have side effects outside the
database, so this is gotten rather for free.  The driving use case for
this *is* side effects on other systems.  I'm not sure if it's as easy
to use this justification here...normally rollbacks just take care of
all the error handling a function would want.  Here I'm not so sure
that is as common a case.


 I did consider using
 aggregates though, but somehow it felt to me like I need at least a
 three-tuple, why not fish around for any random bundling of three
 functions...

 Why do you need three ?

I'm counting the aggregate prototype itself to refer to the bundle,
which I suppose would be more normally considered a two-tuple of
functions.  This is a self-referential tuple, I suppose...

 After all, I would not want to actually call the nodeAgg stuff to
 apply the function anyway...so it'd basically be abused as a
 three-tuple of functions.

 Actually it would be best if it could use straight generic funtions, so
 you could do something like

 COPY stdin TO filterfunc(int) TO avg(int);

Generic functions?  Do you mean just scalar functions?  That'd be
neat, but as I said previously, composition could just be wrapped into
a function of the user's choice.  Also, what about use of
multi-function-apply?

COPY stdin TO replicant1(datum) AND replicant2(datum);

You could imagine all sorts of new 2PC evil.  But again, one could
just write a little function to absorb the rows and dole them out
without bloating COPY syntax...

I am in no way suggesting that syntax seriously or unseriously.

 pass the file name in as an argument to SFUNC, open it on first call,
 ignore later (if it stays the same ;)

So either you are going to pass it with every row and ignore it, or
create a new initial aggregate state for each COPY TO FUNCTION...how
are you going to get it passed to SFUNC?

fdr



Re: [HACKERS] KNNGiST for knn-search

2009-11-24 Thread Teodor Sigaev

I think you'll need to work on that. A WHERE qual shouldn't imply a sort
order. You'll have to teach the planner how to use the index to speed up
a query in the first form.

Of course, right now it is a working prototype.


1. KNNGiST is about 5% slower than GiST on non-knn search queries, like
  contains or contained by, because of some overhead of new algorithm of
  tree traversal


Is it possible to use the regular GiST traversal algorithm on a
KNNGiST-tree, when performing regular GiST searches that don't require a
particular order?
The new algorithm does much more memory allocation/freeing to manage lists,
and that is the single reason for the performance loss. The choice of
algorithm cannot be made by the consistent function; it should be made at
least in the amrescan method, or even earlier - in the planner.

2.  KNNGiST can't be used in  bitmap index scan, which destroys order of
results,
  We don't know the way to forbid bitmap index scan only for knn queries.
  Current version of KNNGiST doesn't distinguish knn-search and usual
search
  and postgres doesn't know about ordered output from KNNGiST.


Yeah, you really need to modify the planner to understand the ordering
and plan accordingly.

Hmm, I thought about it, but still don't have a good idea.
One idea:
SELECT p FROM pt WHERE p  '5.0,5.0'::point ORDER BY (p <-> '5.0,5.0'::point)
DESC LIMIT 10;
And add <-> to the opclass (but for now any indexable operator has to return a
boolean type). Of course, KNNGiST would have to be modified to support not only
k-nearest search but also k-farthest search and NULLS LAST/FIRST.


Not very convenient, because it's necessary to look into the expression of the
ORDER BY. And right now you can specify p  'one point' AND p  'another point',
but it's impossible to express that with an ORDER BY clause.


A second idea, with non-standard syntax:
SELECT ... ORDER BY PROXIMITY OF expression[, expression [..]] TO expression[,
expression [..]] USING [operator [, operator [..]]
where the operator is a distance operator, i.e. it is not a member of a btree
opclass but returns a non-negative float8 value.


Without an index it will be essentially the same as
ORDER BY expression operator expression[ + ..] DESC NULLS LAST


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: [HACKERS] enable-thread-safety defaults?

2009-11-24 Thread Magnus Hagander
On Sat, Nov 21, 2009 at 08:29, Magnus Hagander mag...@hagander.net wrote:
 2009/11/20 Peter Eisentraut pete...@gmx.net:
 On fre, 2009-11-20 at 08:39 +0100, Magnus Hagander wrote:
 2009/11/20 Peter Eisentraut pete...@gmx.net:
  On fre, 2009-11-20 at 02:41 +0100, Magnus Hagander wrote:
  Is there any actual reason why we are building without thread safety
  by default on most platforms?
 
  Consistent defaults on all platforms?

 So why do we have largefile enabled by default? And zlib? And readline?

 Let me be more verbose:  I would assume that we want the configure
 defaults to be the same on all platforms.  We fail by default, for
 example, if zlib and readline are not there, but you can turn them off
 explicitly.  If we turn thread-safety on by default, we will/should fail
 if thread-safety is not supported, requiring the user to turn it off
 explicitly.

 Yes, of course. Silently turning it off would be a really really bad idea.

 If enough platforms don't support thread-safety, this could
 become annoying.

 Agreed.


 I don't have a good overview over how many platforms would be affected,
 and I could in general support changing the default, but I'm just laying
 down one possible constraint.

 Well, the buildfarm would tell us that, no? :)

ISTM that it should be as simple as the attached patch. Seems to work
for me :-) But I'm no autoconf guru, so maybe I missed something?
Comments? If not, how about we put this on HEAD and let the buildfarm
tell us how bad an idea it was?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
*** a/configure.in
--- b/configure.in
***
*** 558,569  IFS=$ac_save_IFS
  # Enable thread-safe client libraries
  #
  AC_MSG_CHECKING([allow thread-safe client libraries])
- if test $PORTNAME != win32; then
- PGAC_ARG_BOOL(enable, thread-safety, no, [make client libraries thread-safe])
- else
- # Win32 should always use threads
  PGAC_ARG_BOOL(enable, thread-safety, yes, [make client libraries thread-safe])
- fi
  
  PGAC_ARG_BOOL(enable, thread-safety-force, no, [force thread-safety despite thread test failure])
  if test $enable_thread_safety = yes -o \
--- 558,564 



Re: [HACKERS] KNNGiST for knn-search

2009-11-24 Thread Heikki Linnakangas
Teodor Sigaev wrote:
 1. KNNGiST is about 5% slower than GiST on non-knn search queries, like
   contains or contained by, because of some overhead of new algorithm of
   tree traversal

 Is it possible to use the regular GiST traversal algorithm on a
 KNNGiST-tree, when performing regular GiST searches that don't require a
 particular order?
 New algorithm works much more with memory for allocation/free to manage
 lists and it's a single reason of performance loss. Choosing of
 algorithm could not be done by consistent function, it should be done at
 least in amrescan method or even earlier - in planner.

Ok, that sounds good. The bottom line is that you can use the same
on-disk tree with both algorithms. No need for a separate indexam in
that case.

 One idea:
 SELECT p FROM pt WHERE p  '5.0,5.0'::point ORDER BY (p -
 '5.0,5.0'::point) DESC LIMIT 10;
 And add - to opclass (but for now any indexable operation should
 return boolean type).

You really shouldn't need to have a WHERE clause.

 Of course, KNNGiST should be modified to support
 not only k-nearest search but k-farest search and NULLS LAST/FIRST.

Well, as long as the planner knows the capabilities of the indexam, it
can just fall back to a seqscan+sort if the query can't be sped up with
the index.

 And now you can specify p  'one point' AND p  'another
 point', but it's impossible to do that by ORDER BY clause.

Huh, what does that mean? Is it like ORDER BY min(p  'one point', p  'another
point')?

 Second idea with non-standard syntax.
 SELECT ... ORDER BY PROXIMITY OF expression[, expression [..]] TO
 expression[, expression [..]] USING [operator [, operator [..]]
 and operator is distance operator, i.e. it's not a member of btree
 opclass, but returns non-negative float8 value.
 
 Without index it will be essentially the same as
 ORDER BY expression operator expression[ + ..] DESC NULLS LAST

We already have the syntax to represent the query, using ORDER BY. IMHO
we just need to teach the planner that when it sees a query like that,
it can use a GiST index to speed it up. A number of indexam and operator
class API changes are probably required, but it should be invisible to
the user.
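
Concretely, the target query shape would be something like this (using the
<-> distance operator discussed above) - no WHERE clause, just an ordering the
index can satisfy:

SELECT p
FROM pt
ORDER BY p <-> point '5.0,5.0'
LIMIT 10;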

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Pavel Stehule
2009/11/24 Daniel Farina drfar...@gmail.com:
 On Tue, Nov 24, 2009 at 2:10 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 I thing, so this patch is maybe good idea. I am missing better
 function specification. Specification by name isn't enough - we can
 have a overloaded functions. This syntax doesn't allow to use explicit
 cast - from my personal view, the syntax is ugly - with type
 specification we don't need to keyword FUNCTION

 As long as things continue to support the INTERNAL-type behavior for
 extremely low overhead bulk transfers I am open to suggestions about
 how to enrich things...but how would I do so under this proposal?


Using an INTERNAL type is wrong. It breaks the design of these functions for
regular PLs. I don't see any reason why it's necessary.

 I am especially fishing for suggestions in the direction of managing
 state for the function between rows though...I don't like how the
 current design seems to scream use a global variable.

 We have a fast copy statement - ok., we have a fast function ok, but
 inside a function we have to call slow sql query. Personally What is
 advantage?

 The implementation here uses a type 'internal' for performance.  It
 doesn't even recompute the fcinfo because of the very particular
 circumstances of how the function is called.  It doesn't do a memory
 copy of the argument buffer either, to the best of my knowledge.  In
 the dblink patches you basically stream directly from the disk, format
 the COPY bytes, and shove it into a waiting COPY on another postgres
 node...there's almost no additional work in-between.  All utilized
 time would be some combination of the normal COPY byte stream
 generation and libpq.


I understand it, and I dislike it. This design isn't general - it is far from
using a function, and it doesn't use the complete FUNCAPI interface. I think
you need different semantics. You are not using a function; you are using
something like a stream object. This stream object can have input and output
functions, and the parameters should be internal (I don't think internal
carries any significant performance benefit here) or standard. The syntax
should be similar to CREATE AGGREGATE.

The syntax could then be:

COPY table TO streamname(parameters)

COPY table TO filestream('/tmp/foo.dta') ...
COPY table TO dblinkstream(connectionstring) ...

This design is only an idea; the details are not important.

What is important is that the current design is limited: it is not possible to
use PLs, mainly untrusted PLs. Using an internal type is a simple hack.

Pavel

 This, of course, presumes that everyone who is interested in building
 on this is going to use some UDFs written in C...


 We need pipes like

 like COPY table TO foo(..) TO table

 foo() should be a transformation function, or real pipe function

 I've actually considered this pipe thing with a colleague while
 driving home from work...it occurred to us that it would be nice to
 have both pipes and tees (basically composition vs. mapping
 application of functions over the input) in some form.  Not sure what
 an elegant way to express that is or how to control it.  Since you can
 work around this by composing or applying functions on your own in
 another function, I'm not sure if that's as high priority for me
 personally.

 fdr




Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 4:37 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2009/11/24 Daniel Farina drfar...@gmail.com:
 On Tue, Nov 24, 2009 at 2:10 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 I thing, so this patch is maybe good idea. I am missing better
 function specification. Specification by name isn't enough - we can
 have a overloaded functions. This syntax doesn't allow to use explicit
 cast - from my personal view, the syntax is ugly - with type
 specification we don't need to keyword FUNCTION

 As long as things continue to support the INTERNAL-type behavior for
 extremely low overhead bulk transfers I am open to suggestions about
 how to enrich things...but how would I do so under this proposal?


 using an INTERNAL type is wrong. It breaks design these functions for
 usual PL. I don't see any reason, why it's necessary.

 I am especially fishing for suggestions in the direction of managing
 state for the function between rows though...I don't like how the
 current design seems to scream use a global variable.

 We have a fast copy statement - ok., we have a fast function ok, but
 inside a function we have to call slow sql query. Personally What is
 advantage?

 The implementation here uses a type 'internal' for performance.  It
 doesn't even recompute the fcinfo because of the very particular
 circumstances of how the function is called.  It doesn't do a memory
 copy of the argument buffer either, to the best of my knowledge.  In
 the dblink patches you basically stream directly from the disk, format
 the COPY bytes, and shove it into a waiting COPY on another postgres
 node...there's almost no additional work in-between.  All utilized
 time would be some combination of the normal COPY byte stream
 generation and libpq.


 I understand and I dislike it. This design isn't general - or it is
 far from using a function. It doesn't use complete FUNCAPI interface.
 I thing so you need different semantic. You are not use a function.
 You are use some like stream object. This stream object can have a
 input, output function, and parameters should be internal (I don't
 thing, so internal could to carry any significant performance here) or
 standard. Syntax should be similar to CREATE AGGREGATE.

I think you might be right about this.  At the time I was too shy to
add a DDL command for this hack, though.  But what I did want is a
form of currying, and that's not easily accomplished in SQL without
extension...

 then syntax should be:

 COPY table TO streamname(parameters)

 COPY table TO filestream('/tmp/foo.dta') ...
 COPY table TO dblinkstream(connectionstring) ...

I like this one quite a bit...it's a bit like an aggregate, except the
initial condition can be set in a rather function-callish way.

But that does seem to require making a DDL command, which leaves a
nice green field.  In particular, we could then make as many hooks,
flags, and options as we wanted, but sometimes there is a paradox of
choice...I just did not want to count on Postgres being friendly
to a new DDL command when writing this the first time.

fdr



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Hannu Krosing
On Tue, 2009-11-24 at 03:48 -0800, Daniel Farina wrote:
 On Tue, Nov 24, 2009 at 3:25 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
  On Tue, 2009-11-24 at 02:56 -0800, Daniel Farina wrote:
  On Tue, Nov 24, 2009 at 2:51 AM, Daniel Farina drfar...@gmail.com wrote:
   On Tue, Nov 24, 2009 at 2:50 AM, Hannu Krosing ha...@2ndquadrant.com 
   wrote:
   Can't you use existing aggregate function design ?
  
   CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
  SFUNC = sfunc,
  STYPE = state_data_type
  [ , FINALFUNC = ffunc ]
  [ , INITCOND = initial_condition ]
  [ , SORTOP = sort_operator ]
   )
  
   Actually, yes.  I just thought that this was an idea so crazy that no
   one would like it.
 
  seems kind of natural choice for me - in essence this is an aggregate
  function, aggregating over rows/tuples supplied to it.
 
 Okay, well, maybe that wasn't such a crazy idea after all...
 
  Oh, and the other elephant in the room: error handling.  How to handle
  error conditions...try/catch/finally type stuff.
 
  Same as current aggregates - either ignore the error, logi it and
  continue, or bail out
 [snip]
  Neither do ordinary funtions, we have no ON ERROR DO ... clause  for
  function definitions
 
 It is assumed most functions do not have side effects outside the
 database, so this is gotten rather for free.  The driving use case for
 this *is* side effects on other systems.  I'm not sure if it's as easy
 to use this justification here...normally rollbacks just take care of
 all the error handling a function would want.  Here I'm not so sure
 that is as common a case.

A cleaner solution for undoing external effects would be an ON ROLLBACK
trigger, or maybe even an extension to BEGIN:

BEGIN WORK ON ROLLBACK RUN externalCleanupFunction();

The ROLLBACK trigger could also be done as a SET parameter inside a session,
so it won't bloat/pollute the system tables if changed often.

 
  I did consider using
  aggregates though, but somehow it felt to me like I need at least a
  three-tuple, why not fish around for any random bundling of three
  functions...
 
  Why do you need three ?
 
 I'm counting the aggregate prototype itself to refer to the bundle,
 which I suppose would be more normally considered a two-tuple of
 functions.  This is a self-referential tuple, I suppose...
 
  After all, I would not want to actually call the nodeAgg stuff to
  apply the function anyway...so it'd basically be abused as a
  three-tuple of functions.
 
  Actually it would be best if it could use straight generic funtions, so
  you could do something like
 
  COPY stdin TO filterfunc(int) TO avg(int);
 
 Generic functions?  Do you mean just scalar functions? 

Type. Actually I meant our existing aggregate functions.

  That'd be
 neat, but as I said previously, composition could just be wrapped into
 a function of the user's choice.  Also, what about use of
 multi-function-apply?
 
 COPY stdin TO replicant1(datum) AND replicant2(datum);

seems like a rare case, but you could use a wrapper func

CREATE FUNCTION replicants_1_and_2(datum) AS
   replicant1(datum)
   replicant2(datum)

 You could imagine all sorts of new 2PC evil. 

2PC is evil anyway, at least where performance is concerned ;)

  But again, one could
 just write a little function to absorb the rows and dole them out
 without bloating COPY syntax...
 
 I am in no way suggesting that syntax seriously or unseriously.
 
  pass the file name in as an argument to SFUNC, open it on first call,
  ignore later (if it stays the same ;)
 
 So either you are going to pass it with every row and ignore it,

That would be my preferred way, yes

 or create a new initial aggregate state for each COPY TO FUNCTION

a third, more hackish way would be to set it as INITCOND = '/file/name' :)

 ...how are you going to get it passed to SFUNC?

keep the file handle in the aggregate node - it is there for keeping state,
and a file handle surely is part of the state.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Pavel Stehule
2009/11/24 Daniel Farina drfar...@gmail.com:
 On Tue, Nov 24, 2009 at 4:37 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 2009/11/24 Daniel Farina drfar...@gmail.com:
 On Tue, Nov 24, 2009 at 2:10 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 I thing, so this patch is maybe good idea. I am missing better
 function specification. Specification by name isn't enough - we can
 have a overloaded functions. This syntax doesn't allow to use explicit
 cast - from my personal view, the syntax is ugly - with type
 specification we don't need to keyword FUNCTION

 As long as things continue to support the INTERNAL-type behavior for
 extremely low overhead bulk transfers I am open to suggestions about
 how to enrich things...but how would I do so under this proposal?


 using an INTERNAL type is wrong. It breaks design these functions for
 usual PL. I don't see any reason, why it's necessary.

 I am especially fishing for suggestions in the direction of managing
 state for the function between rows though...I don't like how the
 current design seems to scream use a global variable.

 We have a fast copy statement - ok., we have a fast function ok, but
 inside a function we have to call slow sql query. Personally What is
 advantage?

 The implementation here uses a type 'internal' for performance.  It
 doesn't even recompute the fcinfo because of the very particular
 circumstances of how the function is called.  It doesn't do a memory
 copy of the argument buffer either, to the best of my knowledge.  In
 the dblink patches you basically stream directly from the disk, format
 the COPY bytes, and shove it into a waiting COPY on another postgres
 node...there's almost no additional work in-between.  All utilized
 time would be some combination of the normal COPY byte stream
 generation and libpq.


 I understand and I dislike it. This design isn't general - or it is
 far from using a function. It doesn't use complete FUNCAPI interface.
 I thing so you need different semantic. You are not use a function.
 You are use some like stream object. This stream object can have a
 input, output function, and parameters should be internal (I don't
 thing, so internal could to carry any significant performance here) or
 standard. Syntax should be similar to CREATE AGGREGATE.

 I think you might be right about this.  At the time I was too shy to
 add a DDL command for this hack, though.  But what I did want is a
 form of currying, and that's not easily accomplished in SQL without
 extension...


COPY is a PostgreSQL extension. If there are other related extensions - why not?
PostgreSQL has a lot of database objects beyond the SQL standard - see the
full-text search implementation. I am not sure STREAM is a good keyword now,
though; it could collide with STREAM from streaming databases.

 then syntax should be:

 COPY table TO streamname(parameters)

 COPY table TO filestream('/tmp/foo.dta') ...
 COPY table TO dblinkstream(connectionstring) ...

 I like this one quite a bit...it's a bit like an aggregate, except the
 initial condition can be set in a rather function-callish way.

 But that does seem to require making a DDL command, which leaves a
 nice green field.  In particular, we could then make as many hooks,
 flags, and options as we wanted, but sometimes there is a paradox of
 choice...I just did not want to anticipate on Postgres being friendly
 to a new DDL command when writing this the first time.


Sure - nobody likes too many changes in gram.y. But a well-designed general
feature with the related SQL enhancements is more acceptable than a fast,
simple hack. Don't be in a hurry. This idea is good - but it needs:

a) a well-designed C API, like:

   initialise_functions(fcinfo)     -- standard fcinfo
   consument_process_tuple(fcinfo)  -- gets a standard row: Datum dvalues[] + row description
   producent_process_tuple(fcinfo)  -- returns a standard row: Datum dvalues[] + row description (see the SRF API)
   terminate_function(fcinfo)

   I am sure this could be similar to the AGGREGATE API,
   plus some samples in contrib

b) well-designed plperlu and plpythonu interfaces,
   plus some samples in the documentation

Regards
Pavel Stehule






Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Hannu Krosing
On Tue, 2009-11-24 at 05:00 -0800, Daniel Farina wrote:
 On Tue, Nov 24, 2009 at 4:37 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:

  then syntax should be:
 
  COPY table TO streamname(parameters)
 
  COPY table TO filestream('/tmp/foo.dta') ...
  COPY table TO dblinkstream(connectionstring) ...

You probably meant

COPY table TO dblinkstream(connectionstring, table)

?

 I like this one quite a bit...it's a bit like an aggregate, except the
 initial condition can be set in a rather function-callish way.
 
 But that does seem to require making a DDL command, which leaves a
 nice green field. 

Not necessarily DDL - maybe just a copystream type and a set of
functions creating objects of that type.

If you make it a proper type with input and output functions, then you
can probably use it in statements like these:

COPY table TO (select stream::copystream from streams where id = 7);

COPY table TO 'file:/tmp/outfile'::copystream;

COPY table TO 'dblink::connectstring'::copystream;
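
A hedged sketch of what that set of types and functions might start from
(copystream and its I/O functions are hypothetical; the input/output functions
would have to be C code, referenced the way contrib module scripts do - this is
just the standard shell-type pattern):

CREATE TYPE copystream;            -- shell type, so the I/O functions can reference it

CREATE FUNCTION copystream_in(cstring) RETURNS copystream
    AS 'MODULE_PATHNAME', 'copystream_in' LANGUAGE C STRICT;

CREATE FUNCTION copystream_out(copystream) RETURNS cstring
    AS 'MODULE_PATHNAME', 'copystream_out' LANGUAGE C STRICT;

CREATE TYPE copystream (
    INPUT  = copystream_in,
    OUTPUT = copystream_out,
    INTERNALLENGTH = VARIABLE
);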

 In particular, we could then make as many hooks,
 flags, and options as we wanted, but sometimes there is a paradox of
 choice...I just did not want to anticipate on Postgres being friendly
 to a new DDL command when writing this the first time.

Full-text search lived for quite some time as a set of types and functions
before it was glorified with its own DDL syntax.

It may be good to take the same approach here - do it as a set of types
and functions first, and think about adding DDL once it has stabilised
enough.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] enable-thread-safety defaults?

2009-11-24 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 ISTM that it should be as simple as the attached patch. Seems to work
 for me :-) But I'm no autoconf guru, so maybe I missed something?

This patch sort of begs the question what about enable-thread-safety-force?
That looks even more like a wart now than it did before.

regards, tom lane



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Robert Haas
On Mon, Nov 23, 2009 at 8:46 PM, Greg Smith g...@2ndquadrant.com wrote:
 You know how people complain about how new contributors are treated here?
  Throwing out comments like this, that come off as belittling to other
 people's work, doesn't help.  All I was suggesting was that Dan wasn't
 developing this in complete isolation from the hackers community as Robert
 had feared, as will be obvious when we get to:

I still think it's better to have discussion on the mailing list than
elsewhere.  But we're doing that now, so, good.

 As far as other past discussion here that might be relevant, this patch
 includes a direct change to gram.y to support the new syntax.  You've
 already suggested before that it might be time to update COPY the same way
 EXPLAIN and now VACUUM have been overhauled to provide a more flexible
 options interface:
  http://archives.postgresql.org/pgsql-hackers/2009-09/msg00616.php  This
 patch might be more fuel for that idea.

FWIW, Tom already committed a patch by Emmanuel and myself that did this.

...Robert



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Pavel Stehule
2009/11/24 Hannu Krosing ha...@2ndquadrant.com:
 On Tue, 2009-11-24 at 05:00 -0800, Daniel Farina wrote:
 On Tue, Nov 24, 2009 at 4:37 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:

  then syntax should be:
 
  COPY table TO streamname(parameters)
 
  COPY table TO filestream('/tmp/foo.dta') ...
  COPY table TO dblinkstream(connectionstring) ...

 You probably meant

 COPY table TO dblinkstream(connectionstring, table)

 ?

 I like this one quite a bit...it's a bit like an aggregate, except the
 initial condition can be set in a rather function-callish way.

 But that does seem to require making a DDL command, which leaves a
 nice green field.

 not necessarily DDL, maybe just a copystream type and a set of
 functions creating objects of that type.

 if you make it a proper type with input and output function, then you
 can probably use it in statements like this

 COPY table TO (select stream::copystream from streams where id = 7);

 COPY table TO 'file:/tmp/outfile':: copystream;

 COPY table TO 'dblink::connectstring':: copystream;

It is interesting - but you still have to have DDL for declaring the stream.
It is analogous to a function:

CREATE FUNCTION 

SELECT 'foo'::regprocedure

But the syntax COPY table TO copystream is a good idea. I like it.



 In particular, we could then make as many hooks,
 flags, and options as we wanted, but sometimes there is a paradox of
 choice...I just did not want to anticipate on Postgres being friendly
 to a new DDL command when writing this the first time.

 fulltext lived for quite some time as set of types and functions before
 it was glorified with its own DDL syntax.

What is DDL? A wrapper around an insert into the system catalogs.

So we can have a table pg_catalog.copystream,

and for initial testing

CREATE OR REPLACE FUNCTION register_copystream(regproc, regproc, regproc ...)

If we are happy with it, then it is one day of work to support a statement

CREATE COPYSTREAM ( ...

Regards
Pavel Stehule


 It may be good to have the same approach here - do it as a set of types
 and functions first, think about adding DDL once it has stabilised
 enough


 --
 Hannu Krosing   http://www.2ndQuadrant.com
 PostgreSQL Scalability and Availability
   Services, Consulting and Training






Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Robert Haas
On Mon, Nov 23, 2009 at 9:37 PM, Andrew Dunstan and...@dunslane.net wrote:


 Greg Smith wrote:

 I haven't heard anything from Andrew about ragged CSV import either.  I
 think that ultimately those features are useful, but just exceed what the
 existing code could be hacked to handle cleanly.

 The patch is attached for your edification/amusement. I have backpatched it
 to 8.4 for the client that needed it, and it's working just fine. I didn't
 pursue it when it was clear that it was not going to be accepted. COPY
 returning text[] would allow us to achieve the same thing, a bit more
 verbosely, but it would be a lot more work to develop.

FWIW, I've somewhat come around to this idea.  But I might be the only one.

...Robert



Re: [HACKERS] Partitioning option for COPY

2009-11-24 Thread Emmanuel Cecchet

Itagaki Takahiro wrote:

I just edited a wiki page for this discussion.
I hope it can be a help.
http://wiki.postgresql.org/wiki/Table_partitioning
  

I guess the problem of handling user triggers is still open.
If we allow triggers on partitions, badly written logic could lead to 
infinite loops in routing. In the case of COPY, an after statement 
trigger could change all the routing decisions taken for each row. I am 
not sure what the semantic should be if you have triggers defined on the 
parent and child tables. Which triggers do you fire if the insert is on 
the parent table but the tuple ends up in a child table?
If the new implementation hides the child tables, it might be safer to 
not allow triggers on child tables altogether and use the parent table 
as the single point of entry to access the partition (and define 
triggers). With the current proposed implementation, would it be 
possible to define a view using child tables?
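
For context, the trigger-based routing documented for 8.x-style partitioning
looks roughly like the sketch below (table names follow the standard docs
example); this is the kind of user trigger the questions above are about, and
the open issue is how such triggers would interact with built-in routing:

CREATE TABLE measurement          (logdate date, peaktemp int);
CREATE TABLE measurement_y2009m01 (CHECK (logdate >= DATE '2009-01-01'
                                      AND logdate <  DATE '2009-02-01'))
    INHERITS (measurement);

CREATE FUNCTION measurement_insert_trigger() RETURNS trigger AS $$
BEGIN
    IF NEW.logdate >= DATE '2009-01-01' AND NEW.logdate < DATE '2009-02-01' THEN
        INSERT INTO measurement_y2009m01 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'Date out of range. Fix measurement_insert_trigger()!';
    END IF;
    RETURN NULL;   -- suppress the insert into the parent itself
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_measurement_trigger
    BEFORE INSERT ON measurement
    FOR EACH ROW EXECUTE PROCEDURE measurement_insert_trigger();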


Emmanuel

--
Emmanuel Cecchet
Aster Data
Web: http://www.asterdata.com




Re: [HACKERS] [INTERFACES] ecpg 8.3 - 8.4 migration

2009-11-24 Thread Michael Meskes
On Sat, Nov 21, 2009 at 12:49:33PM -0800, Mark Richardson wrote:
 I'm pretty sure the problem I found is related to this, but I found that ecpg
 doesn't process booleans correctly- this was in a old version of postgres (I
 think it was 7.4.2).  I traced it down in the code, and there is a section
 that defines the values to be yes or no, but then further processing
 looks for true or false, so the end result is that a boolean is ALWAYS
 false, because the 2 filters don't match.   

Is this a bug that you saw back then in 7 something, or a bug you still see?

 If you're interested in more detail, I have code fixes (they are at work so 
 I'll send on Monday).

Please send them. I'm interested.

Michael

-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ: 179140304, AIM/Yahoo/Skype: michaelmeskes, Jabber: mes...@jabber.org
VfL Borussia! Forca Barca! Go SF 49ers! Use: Debian GNU/Linux, PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SE-PgSQL patch review

2009-11-24 Thread Ross J. Reedstrom
On Tue, Nov 24, 2009 at 03:12:43PM +0900, KaiGai Kohei wrote:
 Itagaki Takahiro wrote:
  * CREATE TABLE tbl (col integer AS SECURITY_CONTEXT = '...')
Is the syntax AS SECURITY_CONTEXT natural in English?
 
 We need to put a reserved token, such as AS, prior to the SECURITY_CONTEXT
 to avoid syntax conflicts with the DEFAULT b_expr option.

Does WITH work? Seems to read better to me:

CREATE TABLE tbl (col integer WITH SECURITY CONTEXT [...])

-- 
Ross Reedstrom, Ph.D. reeds...@rice.edu
Systems Engineer & Admin, Research Scientist      phone: 713-348-6166
The Connexions Project      http://cnx.org           fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Initial refactoring of plperl.c - draft [PATCH]

2009-11-24 Thread Tim Bunce
I've started work on the enhancements to plperl I outlined on pg-general
(XXX thread)
I have a working implementation of those changes, plus some performance
enhancements, that I'm now re-working into a clean set of tested and
polished patches.

This patch is a first step that doesn't add any extra functionality.
It refactors the internals to make adding the extra functionality
easier (and more clearly visible).

Changes in this patch:

- Changed MULTIPLICITY check from runtime to compiletime.
No longer loads the large Config module.
- Changed plperl_init_interp() to return new interp
and not alter the global interp_state
- Moved plperl_safe_init() call into check_interp().
- Removed plperl_safe_init_done state variable
as interp_state now covers that role.
- Changed plperl_create_sub() to take a plperl_proc_desc argument.
- Simplified return value handling in plperl_create_sub.
- Adds a test for the effect of the utf8fix function.

I'd appreciate any feedback on the patch.

The next step I plan is to move the large multi-line string literal
macros (PERLBOOT, SAFE_OK etc) into external perl code files.
That'll make refactoring, extending and maintaining that perl
code far simpler.

A $pkglib_path/perl directory seems an appropriate place for this code.
Assuming that's okay, how should I go about creating that directory and
putting files there during build/installation?

I could implement that and include it as an update to this patch, or as
a new patch on top. Which would be preferable?

Tim.

*** a/src/pl/plperl/expected/plperl.out
--- b/src/pl/plperl/expected/plperl.out
***
*** 555,557  $$ LANGUAGE plperl;
--- 555,564 
  SELECT perl_spi_prepared_bad(4.35) as double precision;
  ERROR:  type does_not_exist does not exist at line 2.
  CONTEXT:  PL/Perl function perl_spi_prepared_bad
+ --
+ -- Test compilation of unicode regex
+ --
+ CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$
+ # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576
+ return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley
+ $$ LANGUAGE plperl;
*** a/src/pl/plperl/plperl.c
--- b/src/pl/plperl/plperl.c
***
*** 125,133  typedef enum
  } InterpState;
  
  static InterpState interp_state = INTERP_NONE;
- static bool can_run_two = false;
  
- static bool plperl_safe_init_done = false;
  static PerlInterpreter *plperl_trusted_interp = NULL;
  static PerlInterpreter *plperl_untrusted_interp = NULL;
  static PerlInterpreter *plperl_held_interp = NULL;
--- 125,131 
***
*** 147,153  Datum		plperl_call_handler(PG_FUNCTION_ARGS);
  Datum		plperl_validator(PG_FUNCTION_ARGS);
  void		_PG_init(void);
  
! static void plperl_init_interp(void);
  
  static Datum plperl_func_handler(PG_FUNCTION_ARGS);
  static Datum plperl_trigger_handler(PG_FUNCTION_ARGS);
--- 145,151 
  Datum		plperl_validator(PG_FUNCTION_ARGS);
  void		_PG_init(void);
  
! static PerlInterpreter *plperl_init_interp(void);
  
  static Datum plperl_func_handler(PG_FUNCTION_ARGS);
  static Datum plperl_trigger_handler(PG_FUNCTION_ARGS);
***
*** 156,166  static plperl_proc_desc *compile_plperl_function(Oid fn_oid, bool is_trigger);
  
  static SV  *plperl_hash_from_tuple(HeapTuple tuple, TupleDesc tupdesc);
  static void plperl_init_shared_libs(pTHX);
  static HV  *plperl_spi_execute_fetch_result(SPITupleTable *, int, int);
  static SV  *newSVstring(const char *str);
  static SV **hv_store_string(HV *hv, const char *key, SV *val);
  static SV **hv_fetch_string(HV *hv, const char *key);
! static SV  *plperl_create_sub(char *proname, char *s, bool trusted);
  static SV  *plperl_call_perl_func(plperl_proc_desc *desc, FunctionCallInfo fcinfo);
  static void plperl_compile_callback(void *arg);
  static void plperl_exec_callback(void *arg);
--- 154,165 
  
  static SV  *plperl_hash_from_tuple(HeapTuple tuple, TupleDesc tupdesc);
  static void plperl_init_shared_libs(pTHX);
+ static void plperl_safe_init(void);
  static HV  *plperl_spi_execute_fetch_result(SPITupleTable *, int, int);
  static SV  *newSVstring(const char *str);
  static SV **hv_store_string(HV *hv, const char *key, SV *val);
  static SV **hv_fetch_string(HV *hv, const char *key);
! static void plperl_create_sub(plperl_proc_desc *desc, char *s);
  static SV  *plperl_call_perl_func(plperl_proc_desc *desc, FunctionCallInfo fcinfo);
  static void plperl_compile_callback(void *arg);
  static void plperl_exec_callback(void *arg);
***
*** 226,232  _PG_init(void)
  	hash_ctl,
  	HASH_ELEM);
  
! 	plperl_init_interp();
  
  	inited = true;
  }
--- 225,232 
  	hash_ctl,
  	HASH_ELEM);
  
! 	plperl_held_interp = plperl_init_interp();
! 	interp_state = INTERP_HELD;
  
  	inited = true;
  }
***
*** 311,322  _PG_init(void)
  	  elog(ERROR,'trusted Perl functions disabled -  \
  	  please upgrade Perl Safe module to version 2.09 or later');}]); }
  
- 

Re: [HACKERS] Initial refactoring of plperl.c - draft [PATCH]

2009-11-24 Thread Tom Lane
Tim Bunce tim.bu...@pobox.com writes:
 The next step I plan is to move the large multi-line string literal
 macros (PERLBOOT, SAFE_OK etc) into external perl code files.
 That'll make refactoring, extending and maintaining that perl
 code far simpler.

That does not seem like it accomplishes anything from the user's
perspective except to add more points of failure.  To name just one:
would you like to debug a problem that stems from a version mismatch
between plperl.so and the external perl files?  I wouldn't.

I can see wanting the *source* to be separate files, but having it as a
compiled constant string in the executable seems like the right thing.

Since this language is obviously going to require Perl to be present at
compile time, running a little Perl script to convert the source into a
C literal wouldn't be a problem AFAICS.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Syntax conflicts in frame clause

2009-11-24 Thread Hitoshi Harada
Rewriting my frame support types patch to allow any expression in
PRECEDING/FOLLOWING clause, I found the syntax below in PG conflicts:

frame_extent: frame_bound { ... }
| BETWEEN frame_bound AND frame_bound { ... }
;
frame_bound: UNBOUNDED PRECEDING { ... }
| UNBOUNDED FOLLOWING { ... }
| CURRENT_P ROW { ... }
| a_expr PRECEDING { ... }
| a_expr FOLLOWING {  }
;

because a_expr (and of course b_expr) contains BETWEEN as
type_func_name_keyword, which means the starting BETWEEN in
frame_extend is completely ambiguous. When I tried to move BETWEEN to
reserved_keyword, it was solved as expected.

In my poor mind there's no way to avoid this situation as long as you
keep BETWEEN as type_func_name_keyword, but could anyone have a
solution for this?
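
To make the ambiguity concrete: once a_expr PRECEDING is allowed, the parser
cannot tell from the BETWEEN token alone which production it is starting.
Both statements below begin a frame clause with BETWEEN (the table t and the
function named between are hypothetical, used only for illustration):

-- keyword reading: BETWEEN introduces the two-bound form
SELECT sum(x) OVER (ORDER BY x ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM t;

-- expression reading: BETWEEN is a type_func_name_keyword, so it can also
-- start an a_expr, e.g. a call to a user function that happens to be
-- named between
SELECT sum(x) OVER (ORDER BY x ROWS between(2) PRECEDING)
FROM t;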

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Kevin Grittner
Andrew Dunstan and...@dunslane.net wrote:
 
 Part of the motivation for allowing inline blocks was to allow for 
 conditional logic. So you can do things like:
 
   DO $$
 
   begin
   if not exists (select 1 from pg_tables
  where schemaname = 'foo'
  and tablename = 'bar') then
create table foo.bar (x int, y text);
   end if;
   end;
 
   $$;
 
 
 It's a bit more verbose (maybe someone can streamline it) but it
 does give you CINE (for whatever flavor of CINE you want), as well
 as lots more complex possibilities than we can conceivably build
 into SQL.
 
So we're conceding that this is a valid need and people will now have
a way to meet it.  Is the argument against having CINE syntax that it
would be more prone to error than the above, or that the code would be
so large and complex as to create a maintenance burden?  (Is there
some other reason I'm missing?)
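
For comparison, the shorthand being asked about would presumably be spelled
something like this (hypothetical syntax at this point, shown only to
contrast with the DO block above):

-- create foo.bar only if it is not already there; an existing foo.bar is
-- left alone, whatever its definition happens to be
CREATE TABLE IF NOT EXISTS foo.bar (x int, y text);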
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] garbage in psql -l

2009-11-24 Thread Oleg Bartunov

Hi there,

I have a problem with CVS HEAD (noticed a week or so ago) -
psql -l shows garbage instead of -|+. Looks like UTF-8 symbols are used
instead of those ASCII characters.

  List of databases
NameБ■┌  Owner   Б■┌ Encoding Б■┌  Collation  Б■┌Ctype
Б■┌   Access privileges
Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■╪Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■╪Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■╪Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■
╪Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■╪Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─Б■─
 contrib_regression Б■┌ postgres Б■┌ UTF8 Б■┌ ru_RU.UTF-8 Б■┌ ru_RU.UTF-8 
Б■┌
 nomao  Б■┌ postgres Б■┌ UTF8 Б■┌ ru_RU.UTF-8 Б■┌ ru_RU.UTF-8 
Б■┌
 postgres   Б■┌ postgres Б■┌ UTF8 Б■┌ ru_RU.UTF-8 Б■┌ ru_RU.UTF-8 
Б■┌
 template0  Б■┌ postgres Б■┌ UTF8 Б■┌ ru_RU.UTF-8 Б■┌ ru_RU.UTF-8 
Б■┌ =c/postgres
Б∙╥  Б∙╥  Б∙╥ Б∙╥ 
Б∙▌ postgres=CTc/postgres


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 So we're conceding that this is a valid need and people will now have
 a way to meet it.  Is the argument against having CINE syntax that it
 would be more prone to error than the above, or that the code would be
 so large and complex as to create a maintenance burden?

The argument against CINE is that it's unsafe.  The fragment proposed
by Andrew is no safer, of course, but it could be made safe by adding
additional checks that the properties of the existing object are what
the script expects.  So in principle that's an acceptable approach,
whereas CINE will never be safe.

But actually I thought we had more or less concluded that CREATE OR
REPLACE LANGUAGE would be acceptable (perhaps only if it's given
without any extra args?).  Or for that matter there seems to be enough
opinion on the side of just installing plpgsql by default.  CINE is
a markedly inferior alternative to either of those.
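
Concretely, the two spellings being weighed would look something like the
following; neither form exists yet, and both are shown only to make the
comparison explicit:

-- CREATE OR REPLACE: install the language, or overwrite an existing entry
-- with the default definition
CREATE OR REPLACE LANGUAGE plpgsql;

-- CINE: do nothing at all if a language of that name already exists,
-- regardless of how it is defined
CREATE LANGUAGE IF NOT EXISTS plpgsql;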

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 The argument against CINE is that it's unsafe.
 
By no means rhetorically, is that based on the assumption that the
statement would not validate that the existing object (if any) matches
the supplied definition?
 
 The fragment proposed by Andrew is no safer, of course, but it could
 be made safe by adding additional checks that the properties of the
 existing object are what the script expects.
 
Again, not rhetorically, is that assuming an error-free mapping of the
CREATE statement to all the related system tables -- each time it is
written by every user, individually?
 
 So in principle that's an acceptable approach,
 whereas CINE will never be safe.
 
Only with the most simplistic implementation of CINE.  I really don't
see how that assertion holds up if there is checking of the supplied
definition against the existing object.  Even the most simplistic
definition is arguably safer than CREATE OR REPLACE, since that can
destroy existing data.  An implementation which does the checking that
you suggest, reviewed by this community to confirm that it is correct,
would seem to beat out most people's home-grown attempts to write what
you suggest.
 
 But actually I thought we had more or less concluded that CREATE OR
 REPLACE LANGUAGE would be acceptable (perhaps only if it's given
 without any extra args?).  Or for that matter there seems to be
 enough opinion on the side of just installing plpgsql by default. 
 CINE is a markedly inferior alternative to either of those.
 
It sounded pretty much like a consensus on installing by default to
me; however, that doesn't seem like it has anything to do with
Andrew's example or my reply to it.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Syntax conflicts in frame clause

2009-11-24 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 Rewriting my frame support types patch to allow any expression in
 PRECEDING/FOLLOWING clause, I found the syntax below in PG conflicts:

Yeah, we ran into that in the original WINDOW patch IIRC, and found some
solution to it that got taken out again when the functionality was cut
down for 8.4.  You might want to look back to see what that looked like;
I think we avoided reserving BETWEEN at the cost of making the
productions a bit more redundant.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Tom Lane
Oleg Bartunov o...@sai.msu.su writes:
 I have a problem with CVS HEAD (noticed a week or so ago) -
 psql -l shows garbage instead of -|+. Looks like UTF-8 symbols are used
 instead of those ASCII characters.

Hm, you only see it for -l and not for all tabular output?  That's
a bit strange.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Oleg Bartunov

On Tue, 24 Nov 2009, Tom Lane wrote:


Oleg Bartunov o...@sai.msu.su writes:

I have a problem with CVS HEAD (noticed a week or so ago) -
psql -l shows garbage instead of -|+. Looks like UTF-8 symbols are used
instead of those ASCII characters.


Hm, you only see it for -l and not for all tabular output?  That's
a bit strange.


yes, I'm surprised myself. Teodor has no problem, but he is under FreeBSD,
while I use Slackware Linux. Here is the ldd output.


pg-h...@zen:~/cvs/HEAD/pgsql$ ldd /usr/local/pgsql-head/bin/psql
linux-gate.so.1 =>  (0xe000)
libpq.so.5 => /usr/local/pgsql-head/lib/libpq.so.5 (0xb7f33000)
libz.so.1 => /usr/lib/libz.so.1 (0xb7ef8000)
libreadline.so.5 => /usr/lib/libreadline.so.5 (0xb7ec8000)
libtermcap.so.2 => /lib/libtermcap.so.2 (0xb7ec4000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0xb7e92000)
libdl.so.2 => /lib/libdl.so.2 (0xb7e8d000)
libm.so.6 => /lib/libm.so.6 (0xb7e67000)
libc.so.6 => /lib/libc.so.6 (0xb7d07000)
/lib/ld-linux.so.2 (0xb7f4f000)




regards, tom lane



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Robert Haas
On Tue, Nov 24, 2009 at 12:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 So we're conceding that this is a valid need and people will now have
 a way to meet it.  Is the argument against having CINE syntax that it
 would be more prone to error than the above, or that the code would be
 so large and complex as to create a maintenance burden?

 The argument against CINE is that it's unsafe.  The fragment proposed
 by Andrew is no safer, of course, but it could be made safe by adding
 additional checks that the properties of the existing object are what
 the script expects.  So in principle that's an acceptable approach,
 whereas CINE will never be safe.

Well, there can be methods extrinsic to the system for controlling
this sort of thing.  For example, I can provide a script, using CINE,
that will either install version 2 of my app into some database or
that will upgrade an existing version 1 installation to version 2.
It's true that if someone has taken the version-1 schema and made
manual modifications to it, then things might blow up.  But, I can
tell people that they shouldn't do that, or the upgrade script might
break.  If they do and it does then they get to keep both pieces.
Even if I do the whole thing in PL/pgsql, I'm still not going to check
for every stupid thing someone might have done to break the schema...
I think the cat is already out of the bag on this one, and it's just a
matter of whether we're willing to provide some convenient syntax or
leave people to hand-code it.
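
A sketch of the kind of upgrade script meant here, with made-up object names
and assuming a CINE spelling for tables were available:

-- installs version 2, or upgrades an untouched version 1 schema to it
CREATE TABLE IF NOT EXISTS app_settings (key text PRIMARY KEY, value text);

-- new in version 2; if someone manually created an incompatible
-- app_audit_log in the meantime, this script makes no attempt to notice
CREATE TABLE IF NOT EXISTS app_audit_log (
    id    serial PRIMARY KEY,
    event text NOT NULL,
    at    timestamptz NOT NULL DEFAULT now()
);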

 But actually I thought we had more or less concluded that CREATE OR
 REPLACE LANGUAGE would be acceptable (perhaps only if it's given
 without any extra args?).

I'm not sure there's any value in that restriction - seems more
confusing than helpful.

 Or for that matter there seems to be enough
 opinion on the side of just installing plpgsql by default.  CINE is
 a markedly inferior alternative to either of those.

For languages, yes.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 The argument against CINE is that it's unsafe.
 
 By no means rhetorically, is that based on the assumption that the
 statement would not validate that the existing object (if any) matches
 the supplied definition?

If it did so, that would be outside the apparent meaning of the
command, which is to do nothing if an object of that name exists.
That's why we've gone with CREATE OR REPLACE instead.

 The fragment proposed by Andrew is no safer, of course, but it could
 be made safe by adding additional checks that the properties of the
 existing object are what the script expects.
 
 Again, not rhetorically, is that assuming an error-free mapping of the
 CREATE statement to all the related system tables -- each time it is
 written by every user, individually?

Yes, I'd expect the user to custom-code it, because it's not clear
exactly which properties the script would be depending on and which ones
it's okay to allow to vary.  To take just one example, is it okay if the
object ownership is different from current user?  That might be fine,
or it might be catastrophic (suppose the script is going to issue GRANT
commands that presuppose particular ownership; if it's different you
could be left with security holes).
 
 Only with the most simplistic implementation of CINE.  I really don't
 see how that assertion holds up if there is checking of the supplied
 definition against the existing object.  Even the most simplistic
 definition is arguably safer than CREATE OR REPLACE, since that can
 destroy existing data.

How exactly would it do that?  You seem to be postulating non-obvious
or not-as-currently-implemented semantics for both variants of the
command, so you had better explain exactly what you think they'd be.

(I agree that CREATE OR REPLACE on a table might be expected to destroy
existing data, but we don't have such a command and there is no proposal
to make one.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Tom Lane
Oleg Bartunov o...@sai.msu.su writes:
 On Tue, 24 Nov 2009, Tom Lane wrote:
 Hm, you only see it for -l and not for all tabular output?  That's
 a bit strange.

 yes, I'm surprised myself. Teodor has no problem, but he is under FreeBSD,
 while I use Slackware Linux. Here is the ldd output.

What's your locale environment?  (env | grep ^L would help.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Oleg Bartunov

On Tue, 24 Nov 2009, Tom Lane wrote:


Oleg Bartunov o...@sai.msu.su writes:

On Tue, 24 Nov 2009, Tom Lane wrote:

Hm, you only see it for -l and not for all tabular output?  That's
a bit strange.



yes, I'm surprised myself. Teodor has no problem, but he is under FreeBSD,
while I use Slackware Linux. Here is the ldd output.


What's your locale environment?  (env | grep ^L would help.)


LC_COLLATE=ru_RU.KOI8-R
LANG=C
LC_CTYPE=ru_RU.KOI8-R

I had no problem with this.


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Nov 24, 2009 at 12:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But actually I thought we had more or less concluded that CREATE OR
 REPLACE LANGUAGE would be acceptable (perhaps only if it's given
 without any extra args?).

 I'm not sure there's any value in that restriction - seems more
 confusing than helpful.

The point would be to reduce the risk that you're changing the language
definition in a surprising way.  Extra args would imply that you're
trying to install a non-default definition of the language.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Tom Lane
Oleg Bartunov o...@sai.msu.su writes:
 On Tue, 24 Nov 2009, Tom Lane wrote:
 What's your locale environment?  (env | grep ^L would help.)

 LC_COLLATE=ru_RU.KOI8-R
 LANG=C
 LC_CTYPE=ru_RU.KOI8-R

Hmm, I can duplicate the fact that psql -l uses utf8 characters
(because it connects to the postgres DB which has utf8 encoding)
but for me, ordinary selects within psql use the utf8 characters
too.  Do you perhaps have something in ~/.psqlrc to force a different
client encoding?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Peter Eisentraut
On tis, 2009-11-24 at 21:32 +0300, Oleg Bartunov wrote:
 On Tue, 24 Nov 2009, Tom Lane wrote:
 
  Oleg Bartunov o...@sai.msu.su writes:
  On Tue, 24 Nov 2009, Tom Lane wrote:
  Hm, you only see it for -l and not for all tabular output?  That's
  a bit strange.
 
  yes, I'm surprised myself. Teodor has no problem, but he is under FreeBSD,
  while I use Slackware Linux. Here is the ldd output.
 
  What's your locale environment?  (env | grep ^L would help.)
 
 LC_COLLATE=ru_RU.KOI8-R
 LANG=C
 LC_CTYPE=ru_RU.KOI8-R
 
 I had no problem with this.

Seems like a mismatch between client encoding and actual locale
environment.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Oleg Bartunov

On Tue, 24 Nov 2009, Tom Lane wrote:


Oleg Bartunov o...@sai.msu.su writes:

On Tue, 24 Nov 2009, Tom Lane wrote:

What's your locale environment?  (env | grep ^L would help.)



LC_COLLATE=ru_RU.KOI8-R
LANG=C
LC_CTYPE=ru_RU.KOI8-R


Hmm, I can duplicate the fact that psql -l uses utf8 characters
(because it connects to the postgres DB which has utf8 encoding)
but for me, ordinary selects within psql use the utf8 characters
too.  Do you perhaps have something in ~/.psqlrc to force a different
client encoding?



yes, 
set client_encoding to KOI8;


but it has never hurt me! I tried commenting it out, but it didn't help.
Note that psql from 8.4 works fine.



regards, tom lane



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Scott Marlowe
On Tue, Nov 24, 2009 at 11:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Nov 24, 2009 at 12:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But actually I thought we had more or less concluded that CREATE OR
 REPLACE LANGUAGE would be acceptable (perhaps only if it's given
 without any extra args?).

 I'm not sure there's any value in that restriction - seems more
 confusing than helpful.

 The point would be to reduce the risk that you're changing the language
 definition in a surprising way.  Extra args would imply that you're
 trying to install a non-default definition of the language.

But if you'd installed it that way before, wouldn't you then need the
arguments this time to have them match?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Oleg Bartunov

On Tue, 24 Nov 2009, Peter Eisentraut wrote:


On tis, 2009-11-24 at 21:32 +0300, Oleg Bartunov wrote:

On Tue, 24 Nov 2009, Tom Lane wrote:


Oleg Bartunov o...@sai.msu.su writes:

On Tue, 24 Nov 2009, Tom Lane wrote:

Hm, you only see it for -l and not for all tabular output?  That's
a bit strange.



yes, I'm surprised myself. Teodor has no problem, but he is under FreeBSD,
while I use Slackware Linux. Here is the ldd output.


What's your locale environment?  (env | grep ^L would help.)


LC_COLLATE=ru_RU.KOI8-R
LANG=C
LC_CTYPE=ru_RU.KOI8-R

I had no problem with this.


Seems like a mismatch between client encoding and actual locale
environment.


why does 8.4 have no real problem?

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 If it did so, that would be outside the apparent meaning of the
 command, which is to do nothing if an object of that name exists.
 That's why we've gone with CREATE OR REPLACE instead.
 
I think that fail on existence of an object conflicting with given
definition is behavior which could be documented and rates fairly
low on my astonishment scale.  (I can't speak for anyone else.)
 
I am skeptical that, in the absence of built-in support for checking
the existing object against the supplied definition, people would
generally go any further than Andrew's example.  When they did, I'm
skeptical about how often they would get the details exactly right.
 
 Yes, I'd expect the user to custom-code it, because it's not clear
 exactly which properties the script would be depending on and which
 ones it's okay to allow to vary.  To take just one example, is it
 okay if the object ownership is different from current user?  That
 might be fine, or it might be catastrophic (suppose the script is
 going to issue GRANT commands that presuppose particular ownership;
 if it's different you could be left with security holes).
 
Yeah, that's an area which I figured would require some discussion. 
The best behavior isn't immediately clear to me in that regard.  I
didn't figure that arriving at some decision on that was necessarily
an insurmountable obstacle.  Similar issue with indexes, although the
answer there seems clearer (at least to me).
 
 (I agree that CREATE OR REPLACE on a table might be expected to
 destroy existing data, but we don't have such a command and there is
 no proposal to make one.)
 
There was, up-thread, discussion by multiple people of the desire to
have CINE for tables.  Andrew's example was specifically about an
alternative way of spelling that.  This branch of the thread has been
all about exactly that.  (Well, at least in my head.)  You asserted
that CREATE OR REPLACE was superior to CINE; I took it to be in
response to the discussion of CINE for tables, but I guess it was
just in the scope of languages.  Sorry for misinterpreting.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Tom Lane
Oleg Bartunov o...@sai.msu.su writes:
 why does 8.4 have no real problem?

Because we never tried to use utf8 table decoration before.  This
is collateral damage from Roger Leigh's recent patches.

The problem is evidently that Oleg is depending on ~/.psqlrc to
set client_encoding the way he wants it, but that file does not
get read for a psql -l invocation.  (Probably not for -c either.)

The locale environment really isn't at issue because we do not look
at it to establish client encoding.  Perhaps Oleg should be setting
PGCLIENTENCODING instead of depending on ~/.psqlrc, but I suspect
he's not the only one doing it that way.

There has been some talk of altering the rules for setting psql's
default client_encoding.  We could think about that, or we could
back off trying to use linestyle=unicode without an explicit setting.
If we do neither, I suspect we'll be hearing more complaints.  I'll
bet there are lots of people who are using database encoding = UTF8
but don't actually have unicode-capable terminal setups.  It's never
hurt them before, especially not if they aren't really storing any
non-ASCII data.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Peter Eisentraut
On tis, 2009-11-24 at 21:55 +0300, Oleg Bartunov wrote:
  Seems like a mismatch between client encoding and actual locale
  environment.
 
 why does 8.4 have no real problem?

Because table formatting with Unicode characters is a new feature.

Anyway, that patch to set the client encoding automatically from the
locale sounds even more useful now.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Anyway, that patch to set the client encoding automatically from the
 locale sounds even more useful now.

I think you're being overoptimistic to assume that that's going to
eliminate the issue.  It might patch things for Oleg's particular
configuration; but the real problem IMO is that people are depending
on ~/.psqlrc to set encoding/locale related behavior, and that file
isn't read before executing -l/-c (not to mention -X).

I wonder whether the most prudent solution wouldn't be to prevent
default use of linestyle=unicode if ~/.psqlrc hasn't been read.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Tom Lane
Scott Marlowe scott.marl...@gmail.com writes:
 On Tue, Nov 24, 2009 at 11:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The point would be to reduce the risk that you're changing the language
 definition in a surprising way.  Extra args would imply that you're
 trying to install a non-default definition of the language.

 But if you'd installed it that way before, wouldn't you then need the
 arguments this time to have them match?

If you knew you'd installed it that way before, you wouldn't be
executing this command at all.  The use-case for commands like this
IMO is scripts that don't know exactly what the database state is.
The use-case for a script that is installing non-default language
parameters into somebody else's database seems pretty darn thin.

I'm not dead set on this by any means.  But it seems like it would
help reduce the risk of bad consequences from CREATE OR REPLACE
LANGUAGE.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Yes, I'd expect the user to custom-code it, because it's not clear
 exactly which properties the script would be depending on and which
 ones it's okay to allow to vary.  To take just one example, is it
 okay if the object ownership is different from current user?

 Yeah, that's an area which I figured would require some discussion. 
 The best behavior isn't immediately clear to me in that regard.  I
 didn't figure that arriving at some decision on that was necessarily
 an insurmountable obstacle.

The reason a script-driven solution seems attractive is exactly that
there doesn't seem to be a good one-size-fits-all behavior for complex
objects.

 There was, up-thread, discussion by multiple people of the desire to
 have CINE for tables.  Andrew's example was specifically about an
 alternative way of spelling that.  This branch of the thread has been
 all about exactly that.  (Well, at least in my head.)

I thought the thread was about CREATE LANGUAGE.  If you want to discuss
CINE in general it would probably be appropriate to start a different
thread about that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Updating column on row update

2009-11-24 Thread Robert Haas
On Tue, Nov 24, 2009 at 2:07 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:

 If it did so, that would be outside the apparent meaning of the
 command, which is to do nothing if an object of that name exists.
 That's why we've gone with CREATE OR REPLACE instead.

 I think that fail on existence of an object conflicting with given
 definition is behavior which could be documented and rates fairly
 low on my astonishment scale.  (I can't speak for anyone else.)

I think CINE should create the object if it does not exist and
otherwise do nothing.  It might be useful to have some kind of
consistency-checking behavior, but it would probably be more useful if
decoupled from CINE, and in any case, that's not what CREATE IF NOT
EXISTS means to me.

 I am skeptical that, in the absence of built-in support for checking
 the existing object against the supplied definition, people would
 generally go any further than Andrew's example.  When they did, I'm
 skeptical about how often they would get the details exactly right.

Bingo.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial refactoring of plperl.c - draft [PATCH]

2009-11-24 Thread Tim Bunce
On Tue, Nov 24, 2009 at 11:57:06AM -0500, Tom Lane wrote:
 Tim Bunce tim.bu...@pobox.com writes:
  The next step I plan is to move the large multi-line string literal
  macros (PERLBOOT, SAFE_OK etc) into external perl code files.
  That'll make refactoring, extending and maintaining that perl
  code far simpler.
 
 That does not seem like it accomplishes anything from the user's
 perspective except to add more points of failure.  To name just one:
 would you like to debug a problem that stems from a version mismatch
 between plperl.so and the external perl files?  I wouldn't.
 
 I can see wanting the *source* to be separate files, but having it as a
 compiled constant string in the executable seems like the right thing.
 
 Since this language is obviously going to require Perl to be present at
 compile time, running a little Perl script to convert the source into a
 C literal wouldn't be a problem AFAICS.

Okay, thanks. I'll take that route.

Tim.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Oleg Bartunov

On Tue, 24 Nov 2009, Tom Lane wrote:


Oleg Bartunov o...@sai.msu.su writes:

why does 8.4 have no real problem?


Because we never tried to use utf8 table decoration before.  This
is collateral damage from Roger Leigh's recent patches.

The problem is evidently that Oleg is depending on ~/.psqlrc to
set client_encoding the way he wants it, but that file does not
get read for a psql -l invocation.  (Probably not for -c either.)

The locale environment really isn't at issue because we do not look
at it to establish client encoding.  Perhaps Oleg should be setting
PGCLIENTENCODING instead of depending on ~/.psqlrc, but I suspect
he's not the only one doing it that way.


yes, PGCLIENTENCODING=KOI8 psql -l works as it should



There has been some talk of altering the rules for setting psql's
default client_encoding.  We could think about that, or we could
back off trying to use linestyle=unicode without an explicit setting.
If we do neither, I suspect we'll be hearing more complaints.  I'll
bet there are lots of people who are using database encoding = UTF8
but don't actually have unicode-capable terminal setups.  It's never
hurt them before, especially not if they aren't really storing any
non-ASCII data.


what's the benefit of using linestyle=unicode? I like the old ASCII style
for the console.




regards, tom lane



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Tom Lane
Oleg Bartunov o...@sai.msu.su writes:
 what's the benefit of using linestyle=unicode? I like the old ASCII style
 for the console.

Well, I have to grant that it looks pretty spiffy on a unicode-enabled
display.  Whether that's enough reason to risk breaking things for
people with non-unicode-enabled displays is certainly worth debating.

Maybe we should just make the default be linestyle=ascii all the time,
and tell people to turn it on in their ~/.psqlrc if they want it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Oleg Bartunov

On Tue, 24 Nov 2009, Tom Lane wrote:


Oleg Bartunov o...@sai.msu.su writes:

what's the benefit of using linestyle=unicode? I like the old ASCII style
for the console.


Well, I have to grant that it looks pretty spiffy on a unicode-enabled
display.  Whether that's enough reason to risk breaking things for
people with non-unicode-enabled displays is certainly worth debating.

Maybe we should just make the default be linestyle=ascii all the time,
and tell people to turn it on in their ~/.psqlrc if they want it.


+1

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Robert Haas
On Tue, Nov 24, 2009 at 4:49 PM, Oleg Bartunov o...@sai.msu.su wrote:
 On Tue, 24 Nov 2009, Tom Lane wrote:

 Oleg Bartunov o...@sai.msu.su writes:

 what's the benefit of using linestyle=unicode? I like the old ASCII style
 for the console.

 Well, I have to grant that it looks pretty spiffy on a unicode-enabled
 display.  Whether that's enough reason to risk breaking things for
 people with non-unicode-enabled displays is certainly worth debating.

 Maybe we should just make the default be linestyle=ascii all the time,
 and tell people to turn it on in their ~/.psqlrc if they want it.

 +1

+1.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Roger Leigh
On Tue, Nov 24, 2009 at 02:19:27PM -0500, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  Anyway, that patch to set the client encoding automatically from the
  locale sounds even more useful now.
 
 I think you're being overoptimistic to assume that that's going to
 eliminate the issue.  It might patch things for Oleg's particular
 configuration; but the real problem IMO is that people are depending
 on ~/.psqlrc to set encoding/locale related behavior, and that file
 isn't read before executing -l/-c (not to mention -X).
 
 I wonder whether the most prudent solution wouldn't be to prevent
 default use of linestyle=unicode if ~/.psqlrc hasn't been read.

This problem is caused when there's a mismatch between the
client encoding and the user's locale.  We can detect this at
runtime and fall back to ASCII if we know they are incompatible.

Why don't we combine the two approaches we looked at so far:
1) The PG client encoding is UTF-8
2) The user's locale codeset (from nl_langinfo(CODESET)) is UTF-8

If *both* the conditions are satisfied simultaneously then we
are guaranteed that things will display correctly given what
the user has told us they wanted.  If only one is satisfied then
we remain using ASCII and problems such as the non-UTF-8-locale
mis-display seen here are avoided, while still allowing Unicode
display for users who have a UTF-8 locale as well as a UTF-8
client encoding (such as myself ;-)

This should be a one-liner patch to update the existing check.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.




Re: [HACKERS] SE-PgSQL patch review

2009-11-24 Thread KaiGai Kohei

Ross J. Reedstrom wrote:

On Tue, Nov 24, 2009 at 03:12:43PM +0900, KaiGai Kohei wrote:

Itagaki Takahiro wrote:

* CREATE TABLE tbl (col integer AS SECURITY_CONTEXT = '...')
  Is the syntax AS SECURITY_CONTEXT natural in English?

We need to put a reserved token, such as AS, prior to the SECURITY_CONTEXT
to avoid syntax conflicts with the DEFAULT b_expr option.


Does WITH work? Seems to read better to me:

CREATE TABLE tbl (col integer WITH SECURITY CONTEXT [...])


It conflicted. :(

--
KaiGai Kohei kai...@kaigai.gr.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 On Tue, Nov 24, 2009 at 02:19:27PM -0500, Tom Lane wrote:
 I wonder whether the most prudent solution wouldn't be to prevent
 default use of linestyle=unicode if ~/.psqlrc hasn't been read.

 This problem is caused when there's a mismatch between the
 client encoding and the user's locale.  We can detect this at
 runtime and fall back to ASCII if we know they are incompatible.

Well, no, that is *one* of the possible failure modes.  I've hit others
already in the short time that the patch has been installed.  The one
that's bit me most is that the locale environment seen by psql doesn't
necessarily match what my xterm at the other end of an ssh connection
is prepared to do --- which is something that psql simply doesn't have
a way to detect.  Again, this is something that's never mattered before
unless one was really pushing non-ASCII data around, and even then it
was often possible to be sloppy.

I'd be more excited about finding a way to use linestyle=unicode by
default if it had anything beyond cosmetic benefits.  But it doesn't,
and it's hard to justify ratcheting up the requirements for users to get
their configurations exactly straight when that's all they'll get for it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] garbage in psql -l

2009-11-24 Thread Roger Leigh
On Tue, Nov 24, 2009 at 05:43:00PM -0500, Tom Lane wrote:
 Roger Leigh rle...@codelibre.net writes:
  On Tue, Nov 24, 2009 at 02:19:27PM -0500, Tom Lane wrote:
  I wonder whether the most prudent solution wouldn't be to prevent
  default use of linestyle=unicode if ~/.psqlrc hasn't been read.
 
  This problem is caused when there's a mismatch between the
  client encoding and the user's locale.  We can detect this at
  runtime and fall back to ASCII if we know they are incompatible.
 
 Well, no, that is *one* of the possible failure modes.  I've hit others
 already in the short time that the patch has been installed.  The one
 that's bit me most is that the locale environment seen by psql doesn't
 necessarily match what my xterm at the other end of an ssh connection
 is prepared to do --- which is something that psql simply doesn't have
 a way to detect.  Again, this is something that's never mattered before
 unless one was really pushing non-ASCII data around, and even then it
 was often possible to be sloppy.

Sure, but this type of misconfiguration is entirely outside the
purview of psql.  Everything else on the system, from man(1) to gcc,
emacs and vi will be sending UTF-8 codes to your terminal for any
non-ASCII character they display.  While psql using UTF-8 for its
tables is certainly exposing the problem, in reality it was already
broken, and it's not psql's fault for using functionality the
system said was available.  It would equally break if you stored
non-ASCII characters in your UTF-8-encoded database and then ran
a SELECT query, since UTF-8 codes would again be sent to the
terminal.

For the specific case here, where the locale is KOI8-R, we can
determine at runtime that this isn't a UTF-8 locale and stay
using ASCII.  I'll be happy to send a patch in to correct this
specific case.

At least on GNU/Linux, checking nl_langinfo(CODESET) is considered
definitive for testing which character set is available, and it's
the standard SUS/POSIX interface for querying the locale.

 I'd be more excited about finding a way to use linestyle=unicode by
 default if it had anything beyond cosmetic benefits.  But it doesn't,
 and it's hard to justify ratcheting up the requirements for users to get
 their configurations exactly straight when that's all they'll get for it.

Bar the lack of nl_langinfo checking, once this is added we will go
out of our way to make sure that the system is capable of handling
UTF-8.  This is, IMHO, the limit of how far *any* tool should go to
handle things.  Worrying about misconfigured terminals, something
which is entirely the user's responsibility, is I think a step too
far--going down this road means you'll be artificially limited to
ASCII, and the whole point of using nl_langinfo is to allow sensible
autoconfiguration, which almost all programs do nowadays.  I don't
think it makes sense to penalise the majority of users with
correctly-configured systems because a small minority have a
misconfigured terminal input encoding.  It is 2009, and all
contemporary systems support Unicode, and for the majority it is the
default.

Every one of the GNU utilities, plus most other free software,
localises itself using gettext, which in a UTF-8 locale, even
English locales, will transparently recode its output into the
locale codeset.  This hasn't resulted in major problems for
people using these tools; it's been this way for years now.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.




Re: [HACKERS] Syntax conflicts in frame clause

2009-11-24 Thread Hitoshi Harada
2009/11/25 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 Rewriting my frame support types patch to allow any expression in
 PRECEDING/FOLLOWING clause, I found the syntax below in PG conflicts:

 Yeah, we ran into that in the original WINDOW patch IIRC, and found some
 solution to it that got taken out again when the functionality was cut
 down for 8.4.  You might want to look back to see what that looked like;
 I think we avoided reserving BETWEEN at the cost of making the
 productions a bit more redundant.

Oops, I'd forgotten about that :(

http://archives.postgresql.org/message-id/6363.1229890...@sss.pgh.pa.us

At the time you put BETWEEN as reserved_keyword, but my trial showed
col_name_keyword can also be a candidate.

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and removing VACUUM FULL

2009-11-24 Thread Simon Riggs
On Sat, 2009-11-21 at 20:20 +0200, Heikki Linnakangas wrote:

 That causes some headaches for Hot Standby

I say leave HS as it is and we can clean up when we do the VFectomy.

It isn't really a headache, the code works easily enough. I agree its
ugly and it should eventually be removed.

Let's not make this any harder, or get involved with promises that we
may not be able to keep. I'd rather we had HS + SR than HS - VF for
example. VF is ugly but it isn't a priority.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and removing VACUUM FULL

2009-11-24 Thread Simon Riggs
On Sat, 2009-11-21 at 23:00 +0200, Heikki Linnakangas wrote:
 Tom Lane wrote:
  Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
  Tom Lane wrote:
  There's no equivalent of XLogArchivingActive()?
  
  XLogArchivingMode() == false enables us to skip WAL-logging in
  operations like CLUSTER or COPY, which is a big optimization. I don't
  see anything like that in Hot Standby. There is a few small things that
  could be skipped, but nothing noticeable.
  
  Huh?  Surely HS requires XLogArchivingMode as a prerequisite ...
 
 Oh, sure! But there's no switch that needs to be enabled in the master
 in addition to that.

We've tried hard to have it just work. But I wonder whether we should
have a parameter to allow performance testing on the master? If nobody
finds any issues then we can remove it again, or at least make it a
hidden developer option.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SE-PgSQL patch review

2009-11-24 Thread KaiGai Kohei
KaiGai Kohei wrote:
 Ross J. Reedstrom wrote:
 On Tue, Nov 24, 2009 at 03:12:43PM +0900, KaiGai Kohei wrote:
 Itagaki Takahiro wrote:
 * CREATE TABLE tbl (col integer AS SECURITY_CONTEXT = '...')
   Is the syntax AS SECURITY_CONTEXT natural in English?
 We need to put a reserved token, such as AS, prior to the 
 SECURITY_CONTEXT
 to avoid syntax conflicts with the DEFAULT b_expr option.

 Does WITH work? Seems to read better to me:

 CREATE TABLE tbl (col integer WITH SECURITY CONTEXT [...])

 It conflicted. :(

BTW, we have two options, if we don't use AS token here.

1. It moves SECURITY to the reserved keywords.
 We can represent the SECURITY CONTEXT option for each column quite
 naturally, but it also has a downside: it disallows using security
 as a column name.

2. Another syntax to support SECURITY CONTEXT
 For example:
  CREATE TABLE tbl_name (
   col_X_name   int    primary key,
   col_Y_name   text   default 'aaa'
  ) SECURITY CONTEXT ( 'label of the table',
   col_Y_name IS 'label of the column Y', ... );

 I don't have any preference between the current syntax and the new one.

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and removing VACUUM FULL

2009-11-24 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Tom Lane wrote:
 There's no equivalent of XLogArchivingActive()?

 We've tried hard to have it just work. But I wonder whether we should
 have a parameter to allow performance testing on the master? If nobody
 finds any issues then we can remove it again, or at least make it a
 hidden developer option.

As long as there's not anything the master actually does differently
then I can't see where there'd be any performance testing to do.  What's
bothering me about this is that it seems likely that we'll find places
where the master has to do things differently.  I'd rather we made the
status visible; if we get through a release cycle without needing to
check it, we can always take the function out again.  But if we don't,
and then find out midway through the 8.5 release cycle that we need to
be able to check it, things could be a bit sticky.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] KNNGiST for knn-search

2009-11-24 Thread Simon Riggs
On Mon, 2009-11-23 at 20:44 +0300, Teodor Sigaev wrote:
 Old way:
 SELECT coordinates, (coordinates <-> '5.0,5.0'::point) AS dist FROM spots
 order by dist asc LIMIT 10;
 
 Time: 1024.242 ms
 
 knn-search:
 SELECT coordinates, (coordinates <-> '5.0,5.0'::point) AS dist FROM
 spots
 WHERE coordinates   '5.0,5.0'::point LIMIT 10;
 
 Time: 3.158 ms
 
 
 We didn't patch existing implementation of GiST for several reasons:
 
 1. KNNGiST is about 5% slower than GiST on non-knn search queries,
 like
contains or contained by, because of some overhead of new algorithm
 of
tree traversal
 2.  KNNGiST can't be used in  bitmap index scan, which destroys order
 of results,
We don't know the way to forbid bitmap index scan only for knn
 queries.
Current version of KNNGiST doesn't distinguish knn-search and usual
 search
and postgres doesn't know about ordered output from KNNGiST.

Sounds very cool.

Seems like you should look at the way sorted_path works in
query_planner().

If you have a query like this

  explain select col1 from s order by col1 limit 10;

then we currently understand that we should use an IndexScan for that.
We don't specifically exclude the bitmap scan, it's just that we know
that the results from the index are ordered and therefore the cost of
sorting the output need not be added. In the bitmap case the cost of the
sort must be added and that's enough to ensure we almost never do that.

I notice that a query like 

  explain select col1 from s order by abs(col1 - 5) limit 10;

is the one-dimensional equivalent of the type of query you're proposing
and that doesn't work either until you put an index on abs(col1 - 5),
then it just works, but only for k = 5.
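
For illustration, a hedged sketch (the table and column come from the query
above; the index name is made up):

	CREATE INDEX s_abs5_idx ON s (abs(col1 - 5));

	EXPLAIN SELECT col1 FROM s ORDER BY abs(col1 - 5) LIMIT 10;
	-- expected plan shape: Limit -> Index Scan using s_abs5_idx on s
	-- (no Sort node, since the index already returns rows in the right order)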

Maybe you should look at the above query and see if there are any usable
similarities for the Knn index.

Part of your problem appears to be that cost_sort does not include
anything about the cost of the comparison operators for different
datatypes.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Partitioning option for COPY

2009-11-24 Thread Itagaki Takahiro

Emmanuel Cecchet m...@asterdata.com wrote:

 I guess the problem of handling user triggers is still open.
 If we allow triggers on partitions, badly written logic could lead to 
 infinite loops in routing.

Infinite loops are not a partition-related problem, no?
We can also find infinite loops in user-defined functions,
recursive queries, etc. I think the only thing we can do
is to *stop* loops rather than prevent them, as max_stack_depth does.

 With the current proposed implementation, would it be 
 possible to define a view using child tables?

No, if you mean using a partition view. I'm thinking we are moving
our implementation of partitioning from view-based to a built-in feature.
Do you have any use cases that require view-based partitioning?
Was inheritance-based partitioning not enough for them?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Jeff Davis
On Tue, 2009-11-24 at 00:54 -0800, Daniel Farina wrote:
 On Tue, Nov 24, 2009 at 12:29 AM, Hannu Krosing ha...@krosing.net wrote:
  COPY stdin TO udf();
 
 If stdin becomes (is?) a legitimate source of records, then this patch
 will Just Work.
 

STDIN is a source of bytes representing a set of records. Currently, the
first argument to COPY is a source or destination of records; and the
second argument is a source or destination of bytes representing a set
of records.

I think we want the first argument to remain a source or destination of
real records with types; that is, a table or perhaps a function. And we
want the second argument to remain a source or destination of bytes;
that is, a file or perhaps a function (albeit not the same kind as the
former function).
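
For reference, the existing byte-oriented forms are just standard COPY
(the file name is only an example):

	COPY mytable TO '/tmp/mytable.dat' WITH BINARY;    -- records out of the table, bytes into a file
	COPY mytable FROM '/tmp/mytable.dat' WITH BINARY;  -- bytes from a file, records into the table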

  COPY udf() FROM stdin;
 
 This is unaddressed, but I think it would be a good idea to consider
 enabling this kind of thing prior to application.

This makes much more sense, but it is a very different type of function
from the original proposal (which basically accepts a buffer). I agree
that it sounds useful and would be good for the sake of symmetry.

One use case may be a degree of data cleaning. For instance, you could
use a looser function definition, like udf(cstring, cstring, ...),
where all COPY does is break up the records into fields, and the
function can recover from type input failures using subtransactions.
Binary mode could do a similar thing with bytea.
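
A hedged sketch of what such a loose target function could look like
(hypothetical names, and using text arguments rather than cstring so it
can be written in PL/pgSQL):

	CREATE TABLE clean_target (id int, val numeric);

	CREATE FUNCTION load_row(f1 text, f2 text) RETURNS void AS $$
	BEGIN
	    BEGIN
	        INSERT INTO clean_target VALUES (f1::int, f2::numeric);
	    EXCEPTION WHEN others THEN
	        -- type input failed; log and keep going instead of aborting the load
	        RAISE NOTICE 'rejected row: (%, %)', f1, f2;
	    END;
	END;
	$$ LANGUAGE plpgsql;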

However, I recommend that you don't try to generalize this as a data
cleanup feature that can handle ragged input. That seems like a separate
problem that will distract from the original use case.

Regards,
Jeff Davis




Re: [HACKERS] SE-PgSQL patch review

2009-11-24 Thread Itagaki Takahiro

KaiGai Kohei kai...@ak.jp.nec.com wrote:

CREATE TABLE tbl (...) SECURITY CONTEXT '...'
  * CREATE TABLE tbl (col integer AS SECURITY_CONTEXT = '...')
 
 We need to put a reserved token, such as AS, prior to the SECURITY_CONTEXT
 to avoid syntax conflicts to DEFAULT b_expr option.

There might be another idea: put the security context in the WITH options:
 1. CREATE TABLE tbl (...) WITH (security_context = '...')
 2. CREATE TABLE tbl (col integer WITH (security_context = '...') PRIMARY KEY)
If we use this syntax, '_' and '=' are reasonable.

BTW, I would like to reverse the order of constraints and WITH options in
column definition (2), but I could not resolve the shift/reduce conflicts
-- it might conflict with PRIMARY KEY WITH (index-parameters).

 - sepgsql_template1_getcon -> pg_get_template1_secon
 - sepgsql_database_getcon  -> pg_get_database_secon

Why do we need two versions of the function, one for template1 and one for the
database? The template1 database is the default template for CREATE DATABASE,
but we can also choose another one. Do we need to distinguish them?

  * It uses dedicated 'SExxx' error codes, but I think they should belong to
the same family of ERRCODE_INSUFFICIENT_PRIVILEGE (42501).
 I already use predefined error codes where they exist.

What I meant was: there is no problem adding new error codes for SE-PgSQL,
but I think the values of the codes should be '42xxx' because those errors
are still Class 42 - Access Rule Violation from the user's point of view.

   Internal structures 
  * Are the security labels enough stable?
We store security labels as text for each object and column.
 
 If the security labels get invalid due to the modification of SELinux
 configuration or other reasons, it considers the database objects are
 unlabeled.

I believe you have a plan to add row-level security checking in a future
version. Do you have any documentation about how the security context for
each row would be implemented? I'm worried about the on-disk representation.
Security labels stored in text format take 20-40 bytes per row. That is not
negligibly small, and might be hard to handle because it is variable-length.

We store OIDs for each row at the end of the tuple header. If we also
store security labels in the same way, will we need some kind of
security-label-to-OID converter in the future?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] Hot standby and removing VACUUM FULL

2009-11-24 Thread Greg Stark
On Wed, Nov 25, 2009 at 2:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 As long as there's not anything the master actually does differently
 then I can't see where there'd be any performance testing to do.  What's
 bothering me about this is that it seems likely that we'll find places
 where the master has to do things differently.  I'd rather we made the
 status visible; if we get through a release cycle without needing to
 check it, we can always take the function out again.  But if we don't,
 and then find out midway through the 8.5 release cycle that we need to
 be able to check it, things could be a bit sticky.

Well the only thing that's been discussed is having vacuum require a
minimum age before considering a transaction visible to all to reduce
the chance of conflicts on cleanup records. But that would require an
actual tunable, not just a flag. And it's something that could
conceivably be desirable even if you're not running a HS setup (if
someone ever reimplements time travel for example).

So I'm not sure adding a flag before there's an actual need for it is
necessarily going to be helpful. It may turn out to be insufficient
even if we have a flag.

And then there's the question of what the slave should do if the
master was running without the flag. Do we make it throw an error?
Does that mean the master needs to insert information to that effect
in the WAL logs? What if you shut down the master, switch the flag, and
start it up again, while you had a standby reading those logs all along?
Will it be able to switch to HS mode now? We won't know until we know
why this flag was necessary and what change in behaviour it might have
caused.



-- 
greg



Re: [HACKERS] Hot standby and removing VACUUM FULL

2009-11-24 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 Well the only thing that's been discussed is having vacuum require a
 minimum age before considering a transaction visible to all to reduce
 the chance of conflicts on cleanup records.

[ shrug... ]  Call me Cassandra.  I am not concerned about what has or
has not been discussed.  I am concerned about what effects we are going
to be blindsided by, a few months from now when it is too late to
conveniently add a way to detect that the system is being run as an HS
master.  If we design it in, perhaps we won't need it --- but if we
design it out, we will need it.  You have heard of Finagle's law, no?

regards, tom lane



Re: [HACKERS] Partitioning option for COPY

2009-11-24 Thread Emmanuel Cecchet

Itagaki Takahiro wrote:

Emmanuel Cecchet m...@asterdata.com wrote:

  

I guess the problem of handling user triggers is still open.
If we allow triggers on partitions, badly written logic could lead to 
infinite loops in routing.


Infinite loops are not a partition-related problem, no?
We can also find infinite loops in user defined functions,
recursive queries, etc. I think the only thing we can do for it
is to *stop* loops instead of prevention, like max_stack_depth.
  
I was thinking of a trigger on child1 updating the partition key, forcing
the tuple to move to child2, and then a trigger on child2 updating the
key again to move the tuple back to child1. You end up with an infinite
loop.
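
A hedged sketch of that scenario (hypothetical names; it assumes automatic
routing re-inserts the row into whichever partition matches the new key):

	CREATE TABLE parent (k int);
	CREATE TABLE child1 (CHECK (k < 10)) INHERITS (parent);
	CREATE TABLE child2 (CHECK (k >= 10)) INHERITS (parent);

	CREATE FUNCTION move_up() RETURNS trigger AS $$
	BEGIN NEW.k := 15; RETURN NEW; END;
	$$ LANGUAGE plpgsql;    -- pushes the key into child2's range

	CREATE FUNCTION move_down() RETURNS trigger AS $$
	BEGIN NEW.k := 5; RETURN NEW; END;
	$$ LANGUAGE plpgsql;    -- pushes the key back into child1's range

	CREATE TRIGGER t1 BEFORE INSERT ON child1
	  FOR EACH ROW EXECUTE PROCEDURE move_up();
	CREATE TRIGGER t2 BEFORE INSERT ON child2
	  FOR EACH ROW EXECUTE PROCEDURE move_down();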
With the current proposed implementation, would it be 
possible to define a view using child tables?



No, if you mean using a partition-view. I'm thinking we are moving
our implementation of partitioning from view-based to built-in feature.
Do you have any use-cases that requires view-based partitioning?
Was the inheritance-based partitioning not enough for it?
  
Nevermind, I was thinking about the implications of materialized views 
but Postgres does not have materialized views!


I have other questions related to create table but I will post them in 
the 'syntax for partitioning' thread.


Thanks
Emmanuel

--
Emmanuel Cecchet
Aster Data
Web: http://www.asterdata.com




Re: [HACKERS] Syntax for partitioning

2009-11-24 Thread Emmanuel Cecchet

Hi,

Sorry for commenting only now, but I think we need to be able to
store the partitions in different tablespaces. Even if the create table
originally creates all partitions in the same tablespace, individual
partitions should be allowed to be moved to different tablespaces using
alter table or alter partition.  I think other databases allow the
user to define a tablespace for each partition in the create table
statement.
In a warehouse, you might want to split your partitions across different
volumes and, over time, move older partitions to storage with higher
compression if that data is no longer accessed frequently.
Altering tablespaces for partitions is important in that context.


Are you also planning to provide partitioning extensions to 'create 
table as'?


Thanks
Emmanuel


Here is a WIP partitioning patch. The new syntax are:
  1. CREATE TABLE parent (...);
  2. ALTER TABLE parent PARTITION BY { RANGE | LIST } ( key );
  3. CREATE TABLE child (...);
  4. ALTER TABLE child INHERIT parent AS PARTITION VALUES ...;

We can also use CREATE TABLE PARTITION BY as 1+2+3+4 and
CREATE PARTITION as 3+4. I think INHERIT AS PARTITION will rarely
be used in practice, but such orthogonality seems cleaner.

The most complex logic of the patch is in ATExecAddInherit(). It scans
the existing partitions and generates a CHECK constraint for the new partition.

Any comments on the design?  If there are no objections, I'd like to stop adding
features in this CommitFest and move on to the remaining auxiliary work
-- pg_dump, object dependency checking, documentation, etc.

  

-
 Catalog changes
-


In addition to pg_partition, I added a pg_inherits.inhvalues field.
The type of the field is anyarray, and it stores the partition values.
For a range partition, the upper bound value is stored in the array.
For a list partition, the list values are stored in it. These separate
value fields will be useful for implementing partition triggers in the
future. In contrast, reverse-engineering check constraints is messy.

CATALOG(pg_inherits,2611) BKI_WITHOUT_OIDS
{
    Oid         inhrelid;
    Oid         inhparent;
    int4        inhseqno;
    anyarray    inhvalues;      /* values for partition */
} FormData_pg_inherits;
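
A hedged illustration of the intended contents (hypothetical partitions): if
child_1 is a range partition created with VALUES LESS THAN 10 and child_jp is
a list partition created with the values ('jp', 'us'), then

	SELECT inhrelid::regclass, inhparent::regclass, inhvalues FROM pg_inherits;
	--  inhrelid | inhparent | inhvalues
	-- ----------+-----------+-----------
	--  child_1  | parent    | {10}
	--  child_jp | parent    | {jp,us}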

  

CREATE TABLE pg_partition (
partrelid oid REFERENCES oid ON pg_class,-- partitioned table oid
partopr   oid REFERENCES oid ON pg_operator, -- operator to compare keys
partkind  char, -- kind of partition: 'R' (range) or 'L' (list)
partkey   text,   -- expression tree of partition key
PRIMARY KEY (partrelid)
) WITHOUT OIDS;



--
 Limitations and Restrictions
--
* We can create a new partition as long as its partitioning keys
  do not conflict with existing partitions. In particular,
  we cannot add any partitions if we have an overflow partition,
  because a new partition always splits the overflow partition.

* We cannot reuse an existing check constraint as a partition
  constraint. ALTER TABLE INHERIT AS PARTITION requires
  a table scan to add the new CHECK constraint.

* No partition triggers, and no planner or executor improvements.
  Those will come in future development.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

  




  



--
Emmanuel Cecchet
Aster Data
Web: http://www.asterdata.com




Re: [HACKERS] Syntax for partitioning

2009-11-24 Thread Itagaki Takahiro

Emmanuel Cecchet m...@asterdata.com wrote:

 I think that other databases allows the 
 user to define a tablespace for each partition in the create table 
 statement.

WITH and TABLESPACE clauses are supported for each partition.

 =# CREATE TABLE parent (...) PARTITION BY (key)
(
  PARTITION child_1 VALUES LESS THAN 10 WITH (...) TABLESPACE tbs_1
);
 =# CREATE PARTITION child_2 ON parent
VALUES LESS THAN 20 WITH (...) TABLESPACE tbl_2;
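
The same partition could also be attached with the decomposed form
(an untested sketch; tbs_3 is a made-up tablespace):

 =# CREATE TABLE child_3 (LIKE parent) TABLESPACE tbs_3;
 =# ALTER TABLE child_3 INHERIT parent AS PARTITION VALUES LESS THAN 30;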

 Are you also planning to provide partitioning extensions to 'create 
 table as'?

Ah, I forgot that. It would be possible to have the feature.
There are no syntax issues, but it would have to come after we support
automatic INSERT routing. We can create the table with partitions,
but the tuples are not divided into the child partitions because we have
no insert triggers at the time of CREATE TABLE AS.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] SE-PgSQL patch review

2009-11-24 Thread KaiGai Kohei
Itagaki Takahiro wrote:
 KaiGai Kohei kai...@ak.jp.nec.com wrote:
 
   CREATE TABLE tbl (...) SECURITY CONTEXT '...'
 * CREATE TABLE tbl (col integer AS SECURITY_CONTEXT = '...')
 We need to put a reserved token, such as AS, prior to the SECURITY_CONTEXT
 to avoid syntax conflicts to DEFAULT b_expr option.
 
 There might be another idea to put security context in WITH options:
  1. CREATE TABLE tbl (...) WITH (security_context = '...')
  2. CREATE TABLE tbl (col integer WITH (security_context = '...') PRIMARY KEY)
 If we use the syntax, '_' and '=' is reasonable.
 
 BTW, I like to reverse the order of constraints and WITH options in
 column definitions (2), but I could not solve shift/reduce errors
 -- it might conflict with PRIMARY KEY WITH (index-parameters).

If we put the SECURITY CONTEXT clause before the column constraints,
there are no syntax conflicts. However, it does not seem intuitive to me:

  e.g., CREATE TABLE tbl (col int SECURITY CONTEXT '...' NOT NULL);

In addition, if we put security_context into the relation options,
the way to fetch it becomes quite different from other database objects.

Instead, what is your opinion of this syntax?

 CREATE TABLE tbl (...) SECURITY CONTEXT ('label', col='label', ...);

 When col= is omitted, it specifies an explicit security context for the
 new table. Otherwise, it specifies one for the given column.

And, for consistency,

 CREATE DATABASE dbname SECURITY CONTEXT ('label');
 CREATE SCHEMA scname SECURITY CONTEXT ('label');
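
To make the table form concrete (hypothetical labels; the actual context
strings depend on the local SELinux policy):

 CREATE TABLE customer (
     cid   int PRIMARY KEY,
     cname text,
     ccard text
 ) SECURITY CONTEXT ('system_u:object_r:sepgsql_table_t:s0',
                     ccard = 'system_u:object_r:sepgsql_secret_table_t:s0');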


 - sepgsql_template1_getcon - pg_get_template1_secon
 - sepgsql_database_getcon  - pg_get_database_secon
 
 Why do we need two version of functions for template1 and database?
 The template1 database is the default template for CREATE DATABASE,
 but we can also choose another one. Do we need to distinguish them?

They have different purposes.

The sepgsql_database_getcon() prints out the security context of the
database with the given OID in human-readable form.

The sepgsql_template1_getcon() returns the security context to be
assigned to the initial database, taken from the SELinux configuration.
Typically, it is configured in /etc/selinux/targeted/contexts/sepgsql_contexts.
If that does not exist, it asks SELinux for a default security context for the
initial database.
Then initdb uses the result to assign the initial security contexts of the
managed database objects.

 * It uses dedicated 'SExxx' error codes, but I think they should belong to
   the same family of ERRCODE_INSUFFICIENT_PRIVILEGE (42501).
 I already uses predefined error code, if exist.
 
 What I meant was: there are no problem to add new error codes for SE-PgSQL,
 but I think the values of the codes should be '42xxx' because those errors
 are still Class 42 - Access Rule Violation from the view of users.

Ahh, OK. I'll fix it.

  Internal structures 
 * Are the security labels enough stable?
   We store security labels as text for each object and column.
 If the security labels get invalid due to the modification of SELinux
 configuration or other reasons, it considers the database objects are
 unlabeled.
 
 I believe you have a plan to add row-level security checking in the future
 version. Do you have some documentation about how to implement security
 context for each row? I'm worrying about the on-disk representation.
 Security labels stored in text format takes 20-40 bytes per row. It is not
 negligibly-small, and might be hard to be treated because of variable-length.
 
 We store OIDs for each row at the end of tuple header. If we also
 store securty labels in the same way, will we need some kinds of
 securty label to OID converter in the future?

Yes, it was included in the earlier proposal with the full set of functionality.

http://wiki.postgresql.org/wiki/SEPostgreSQL_Architecture#Interaction_between_pg_security_system_catalog

In the SELinux model, a massive number of objects share a limited number of
security contexts (e.g. more than 100 tables may have the same one), so this
design (storing the security label OID within the tuple header) is well
suited to database objects.
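
Roughly, the shared-label scheme amounts to a mapping like the following
(a sketch only; the catalog name and columns here are made up):

	CREATE TABLE pg_security_sketch (
	    secid    oid  PRIMARY KEY,      -- the OID stored in each tuple header
	    seclabel text NOT NULL UNIQUE   -- e.g. 'system_u:object_r:sepgsql_table_t:s0'
	);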

BTW, I plan the following steps for the row-level security.
| * A facility to put security label OID within the tuple header.
| * System column support to print out the security context.
|   (This system column shall be writable to relabel)
| * Pure-SQL row-level security checks, something like Oracle Virtual
|   Private Database, which allows a user-defined access control decision
|   function.
| * SELinux-aware row-level checks on top of the virtual private database
V   stuff. They can be implemented as one of the decision-making functions.

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Jeff Davis
On Tue, 2009-11-24 at 14:39 +0100, Pavel Stehule wrote:
 a) good designed C API  like:
 
initialise_functions(fcinfo) -- std fcinfo
consument_process_tuple(fcinfo) -- gets standard row -- Datum
 dvalues[] + Row description
producent_process_tuple(fcinfo) -- returns standard row  -- Datum
 dvalues[] + Row description (look on SRF API)
terminate_funnction(fcinfo)
 

Don't you still need the functions to accept an argument of type
internal? Otherwise, we lose the ability to copy a buffer to the dblink
connection, which was the original motivation.

Regards,
Jeff Davis




Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 Don't you still need the functions to accept an argument of type
 internal? Otherwise, we lose the ability to copy a buffer to the dblink
 connection, which was the original motivation.

If you do that, then there is no possibility of ever using this feature
except with C-coded functions, which seems to me to remove most of
whatever use-case there was.

regards, tom lane



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Pavel Stehule
2009/11/25 Jeff Davis pg...@j-davis.com:
 On Tue, 2009-11-24 at 14:39 +0100, Pavel Stehule wrote:
 a) good designed C API  like:

    initialise_functions(fcinfo) -- std fcinfo
    consument_process_tuple(fcinfo) -- gets standard row -- Datum
 dvalues[] + Row description
    producent_process_tuple(fcinfo) -- returns standard row  -- Datum
 dvalues[] + Row description (look on SRF API)
    terminate_funnction(fcinfo)


 Don't you still need the functions to accept an argument of type
 internal? Otherwise, we lose the ability to copy a buffer to the dblink
 connection, which was the original motivation.


It depends on the design. I don't think internal is necessary. It is
just the wrong design.

Pavel






Re: [HACKERS] Syntax for partitioning

2009-11-24 Thread Greg Smith
I just made a few updates to 
http://wiki.postgresql.org/wiki/Table_partitioning , merging in the 
stuff that had been on the ToDo page and expanding the links to 
discussion on this list a bit.  The number of submitted patches over the 
last couple of years that handle some subset of the desired feature set 
here is really remarkable when you see them all together.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 8:45 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 It depends on design. I don't thing so internal is necessary. It is
 just wrong design.

Depends on how lean you want to be when doing large COPY...right now
the cost is restricted to having to call a function pointer and a few
branches.  If you want to take SQL values, then the semantics of
function calling over a large number of rows is probably notably more
expensive, although I make no argument against the fact that the
non-INTERNAL version would give a lot more people more utility.

fdr



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Jeff Davis
On Tue, 2009-11-24 at 23:44 -0500, Tom Lane wrote:
 If you do that, then there is no possibility of ever using this feature
 except with C-coded functions, which seems to me to remove most of
 whatever use-case there was.

It fits the use case involving dblink (or dblink-like modules).

Maybe the patch's performance should be tested with and without copying
the buffer, to see if we're losing anything significant. If we can do
almost as well copying the data and passing that as a bytea value to the
function, then I agree that would be better.

I still don't see any reason to force it to be record by record though.
If the point is to push data from a table into a remote table, why
should the copied data be translated out of binary format into a record,
and then back into binary form to send to the remote system?

Currently, the second argument to copy is a source or destination of
bytes, not records. So forcing it to deal with records is inconsistent.

Regards,
Jeff Davis




Re: [HACKERS] Hot standby and removing VACUUM FULL

2009-11-24 Thread Simon Riggs
On Wed, 2009-11-25 at 03:12 +, Greg Stark wrote:
 On Wed, Nov 25, 2009 at 2:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  As long as there's not anything the master actually does differently
  then I can't see where there'd be any performance testing to do.  What's
  bothering me about this is that it seems likely that we'll find places
  where the master has to do things differently.  I'd rather we made the
  status visible; if we get through a release cycle without needing to
  check it, we can always take the function out again.  But if we don't,
  and then find out midway through the 8.5 release cycle that we need to
  be able to check it, things could be a bit sticky.
 
 Well the only thing that's been discussed is having vacuum require a
 minimum age before considering a transaction visible to all to reduce
 the chance of conflicts on cleanup records. But that would require an
 actual tunable, not just a flag. And it's something that could
 conceivably be desirable even if you're not running a HS setup (if
 someone ever reimplements time travel for example).

I will add this also, if it looks simple to do so. Even if we yank it
out later, it is better to have the code for discussion purposes than just a
conceptual bikeshed.

 So I'm not sure adding a flag before there's an actual need for it is
 necessarily going to be helpful. It may turn out to be insufficient
 even if we have a flag.

Same situation as in archiving.

The debate was eventually carried that we should have
archive_mode = on
archive_ =  for additional parameters

 And then there's the question of what the slave should do if the
 master was running without the flag. Do we make it throw an error?

Well, it can't even enter HS mode, so no error needed.

 Does that mean the master needs to insert information to that effect
 in the wal logs? What if you shut down the master switch the flag and
 start it up again and you had a standby reading those logs all along.
 Will it be able to switch to HS mode now? We won't know until we know
 why this flag was necessary and what change in behaviour it might have
 caused.

I'm more comfortable running a new machine when it has an off switch.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] SE-PgSQL patch review

2009-11-24 Thread KaiGai Kohei
 * It uses dedicated 'SExxx' error codes, but I think they should belong to
   the same family of ERRCODE_INSUFFICIENT_PRIVILEGE (42501).
 I already uses predefined error code, if exist.
 What I meant was: there are no problem to add new error codes for SE-PgSQL,
 but I think the values of the codes should be '42xxx' because those errors
 are still Class 42 - Access Rule Violation from the view of users.
 
 Ahh, OK. I'll fix it.

I also think ERRCODE_INVALID_SECURITY_CONTEXT belongs in the Access
Rule Violation class ('42xxx').

However, it seems to me ERRCODE_SELINUX_INTERNAL_ERROR should be moved
to the System Error class ('58xxx'), because it is raised due to a
problem communicating with SELinux, not an access violation.

And we may be able to remove ERRCODE_SELINUX_AUDIT_LOG, because audit
logs are generated on access violation events (in most cases, if the security
policy is right), so ERRCODE_INSUFFICIENT_PRIVILEGE might be suitable
for calling ereport(LOG, ...) with an audit log message.

Or is that a strange convention?

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 9:13 PM, Jeff Davis pg...@j-davis.com wrote:


 I still don't see any reason to force it to be record by record though.
 If the point is to push data from a table into a remote table, why
 should the copied data be translated out of binary format into a record,
 and then back into binary form to send to the remote system?

 Currently, the second argument to copy is a source or destination of
 bytes, not records. So forcing it to deal with records is inconsistent.

You are correct.  It so happens, as an artifact of how COPY is written,
that things are delivered row by row, but at some fundamental level it
would not matter if that were not the case...

fdr



Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Pavel Stehule
2009/11/25 Daniel Farina drfar...@gmail.com:
 On Tue, Nov 24, 2009 at 8:45 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 It depends on design. I don't thing so internal is necessary. It is
 just wrong design.

 Depends on how lean you want to be when doing large COPY...right now
 the cost is restricted to having to call a function pointer and a few
 branches.  If you want to take SQL values, then the semantics of
 function calling over a large number of rows is probably notably more
 expensive, although I make no argument against the fact that the
 non-INTERNAL version would give a lot more people more utility.

I believe using internal minimizes the necessary changes in the COPY
implementation. Using funcapi needs more work inside COPY - you
have to move some functionality from COPY into the stream functions.
Probably the slowest operation is parsing - calling the input
functions - and that is called once either way. The second slowest
operation is reading from the network - also the same. So I don't see
many reasons why a non-internal implementation has to be significantly
slower than your current implementation. I am sure it needs more work, though.

What is significant: when COPY and a streaming function are joined
properly, I don't need to use a tuplestore or SRF functions. COPY
reads the data directly.


 fdr




Re: [HACKERS] Syntax for partitioning

2009-11-24 Thread Emmanuel Cecchet

Greg Smith wrote:
I just made a few updates to 
http://wiki.postgresql.org/wiki/Table_partitioning , merging in the 
stuff that had been on the ToDo page and expanding the links to 
discussion on this list a bit.  The number of submitted patches over 
the last couple of years that handle some subset of the desired 
feature set here is really remarkable when you see them all together.


Should we add the 'WITH (...) TABLESPACE tbs' options to the syntax 
since they are supported?

Do we support ALTER ... SET TABLESPACE?

Emmanuel

--
Emmanuel Cecchet
Aster Data
Web: http://www.asterdata.com




Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Daniel Farina
On Tue, Nov 24, 2009 at 9:35 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2009/11/25 Daniel Farina drfar...@gmail.com:
 On Tue, Nov 24, 2009 at 8:45 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 It depends on design. I don't thing so internal is necessary. It is
 just wrong design.

 Depends on how lean you want to be when doing large COPY...right now
 the cost is restricted to having to call a function pointer and a few
 branches.  If you want to take SQL values, then the semantics of
 function calling over a large number of rows is probably notably more
 expensive, although I make no argument against the fact that the
 non-INTERNAL version would give a lot more people more utility.

 I believe so using an internal minimalize necessary changes in COPY
 implementation. Using a funcapi needs more work inside COPY -  you
 have to take some functionality from COPY to stream functions.
 Probably the most slow operations is parsing - calling a input
 functions. This is called once every where. Second slow operation is
 reading from network - it is same. So I don't see too much reasons,
 why non internal implementation have to be significant slower than
 your actual implementation. I am sure, so it needs more work.

You are probably right.  We could try coercing to bytea and back out
to bytes, although it seems like a superfluous cost to force
*everyone* to pay just to get the same bytes to a network buffer.

fdr



Re: [HACKERS] Hot standby and removing VACUUM FULL

2009-11-24 Thread Greg Stark
On Wed, Nov 25, 2009 at 3:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark gsst...@mit.edu writes:
 Well the only thing that's been discussed is having vacuum require a
 minimum age before considering a transaction visible to all to reduce
 the chance of conflicts on cleanup records.

 [ shrug... ]  Call me Cassandra.  I am not concerned about what has or
 has not been discussed.  I am concerned about what effects we are going
 to be blindsided by, a few months from now when it is too late to
 conveniently add a way to detect that the system is being run as an HS
 master.  If we design it in, perhaps we won't need it --- but if we
 design it out, we will need it.  You have heard of Finagle's law, no?

Well the point here was that the only inkling of a possible need for
this that we have is going to require more than an on/off switch
anyways. That's likely to be true of any need which arises.

And you didn't answer my questions about what the semantics of this switch
will be. Will a replica which starts up while reading WAL logs
generated by this database refuse connections even if it's
configured to allow them? How will it determine what the switch was on
the master? The value of the switch at what point in time? The answers
to these questions seem to depend on what need triggered the
existence of the switch.

-- 
greg



Re: [HACKERS] Syntax for partitioning

2009-11-24 Thread Itagaki Takahiro

Emmanuel Cecchet m...@asterdata.com wrote:

 Should we add the 'WITH (...) TABLESPACE tbs' options to the syntax 
 since they are supported?

Added the description.

 Do we support ALTER ... SET TABLESPACE?

DROP/ALTER PARTITION are synonyms for DROP/ALTER TABLE.
SET TABLESPACE is also supported. Added the description.
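
For example (a sketch; the partition and tablespace names are made up):

	ALTER TABLE child_1 SET TABLESPACE tbs_archive;
	-- or, using the partition synonym:
	ALTER PARTITION child_1 SET TABLESPACE tbs_archive;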

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Pavel Stehule
2009/11/25 Daniel Farina drfar...@gmail.com:
 On Tue, Nov 24, 2009 at 9:35 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 2009/11/25 Daniel Farina drfar...@gmail.com:
 On Tue, Nov 24, 2009 at 8:45 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 It depends on design. I don't thing so internal is necessary. It is
 just wrong design.

 Depends on how lean you want to be when doing large COPY...right now
 the cost is restricted to having to call a function pointer and a few
 branches.  If you want to take SQL values, then the semantics of
 function calling over a large number of rows is probably notably more
 expensive, although I make no argument against the fact that the
 non-INTERNAL version would give a lot more people more utility.

 I believe so using an internal minimalize necessary changes in COPY
 implementation. Using a funcapi needs more work inside COPY -  you
 have to take some functionality from COPY to stream functions.
 Probably the most slow operations is parsing - calling a input
 functions. This is called once every where. Second slow operation is
 reading from network - it is same. So I don't see too much reasons,
 why non internal implementation have to be significant slower than
 your actual implementation. I am sure, so it needs more work.


internal is important (for performance) for aggregate functions -
it protects against repeated alloc/free of memory - it works well,
but it is a more or less ugly hack. We cannot do some things well - some
support is simply missing. Nobody planned for very large string or
array concatenation at design time - that is the reason why I am against
using it.

 You are probably right.  We could try coercing to bytea and back out
 to bytes, although it seems like a superfluous cost to force
 *everyone* to pay just to get the same bytes to a network buffer.


I am not sure this is a good analogy. Only a file stream or a network
stream is a stream of bytes. From any more sophisticated stream I am taking
tuples - a database stream, a SOAP stream. I agree that dblink could
return binary-compatible records - but that is one special and
exclusive case. Sure, it is important and has to be accounted for. Still, I
think dblink to postgres is another hack and should be replaced.)

 fdr




Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Jeff Davis
On Tue, 2009-11-24 at 21:42 -0800, Daniel Farina wrote:
 You are probably right.  We could try coercing to bytea and back out
 to bytes, although it seems like a superfluous cost to force
 *everyone* to pay just to get the same bytes to a network buffer.

Well, I suppose only performance will tell. Copying a buffer is sure to
be faster than invoking all of the type input/output functions, or even
send/recv, so perhaps it's not a huge penalty.

My disagreement with the row-by-row approach is more semantics than
performance. COPY translates records to bytes and vice-versa, and your
original patch maintains those semantics.

Regards,
Jeff Davis




Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

2009-11-24 Thread Jeff Davis
On Wed, 2009-11-25 at 06:35 +0100, Pavel Stehule wrote:
 I believe so using an internal minimalize necessary changes in COPY
 implementation. Using a funcapi needs more work inside COPY -  you
 have to take some functionality from COPY to stream functions.
 Probably the most slow operations is parsing - calling a input
 functions. This is called once every where. Second slow operation is
 reading from network - it is same. So I don't see too much reasons,
 why non internal implementation have to be significant slower than
 your actual implementation. I am sure, so it needs more work.

I apologize, but I don't understand what you're saying. Can you please
restate with some examples?

It seems like you're advocating that we move records from a table into a
function using COPY. But that's not what COPY normally does: COPY
normally translates records to bytes or bytes to records.

Moving records from a table to a function can be done with:
  SELECT myfunc(mytable) FROM mytable;
already. The only problem is if you want initialization/destruction. But
I'm not convinced that COPY is the best tool to provide that.

Moving records from a function to a table can be done with:
  INSERT INTO mytable SELECT * FROM myfunc();
And that already works fine.
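
A minimal self-contained version of those two existing paths (hypothetical
names; gen_rows stands in for the second myfunc to keep the two distinct):

	CREATE TABLE mytable (id int, v text);

	CREATE FUNCTION myfunc(r mytable) RETURNS void AS $$
	BEGIN
	    RAISE NOTICE 'consumed row %', r.id;   -- stand-in for real per-row work
	END;
	$$ LANGUAGE plpgsql;

	SELECT myfunc(mytable) FROM mytable;              -- table -> function

	CREATE FUNCTION gen_rows() RETURNS SETOF mytable AS $$
	    SELECT g, 'row ' || g FROM generate_series(1, 3) g;
	$$ LANGUAGE sql;

	INSERT INTO mytable SELECT * FROM gen_rows();     -- function -> table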

So what use case are you concerned about?

Regards,
Jeff Davis



