Re: [HACKERS] BufferAccessStrategy for bulk insert

2008-10-30 Thread Simon Riggs

On Thu, 2008-10-30 at 23:05 -0400, Robert Haas wrote:
> > Whatever timings you have are worth publishing.
> 
> Here are the timings for copying the first ten million integers into a
> one-column table created in the same transaction, with and without the
> patch.  As you can see, now that I've corrected my previous error of
> not putting CREATE TABLE and COPY in the same transaction, the savings
> are quite substantial, about 15%.  Nice!

I had faith. ;-)

Can you test whether using the buffer access strategy is a win or a
loss? Most of that gain is probably coming from the reduction in
pinning.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_agg and array_accum (patch)

2008-10-30 Thread Jeff Davis
On Wed, 2008-10-29 at 00:08 -0400, Robert Haas wrote:
> It's worth noting that this is the third version of this idea that has
> been submitted.  Ian Caulfield submitted a patch to add this, and so
> did I.  Someone should probably look at all three of them and compare.
> 

If we include a function named array_accum(), it should return an empty
array on no input to match the function in the docs:

http://www.postgresql.org/docs/8.3/static/xaggr.html

Your function returns NULL on no input, which seems more like
array_agg().

Aside from that, I'm pretty open to anything, as long as one of our
patches makes it. If there are potential problems with the standard
(where we don't want to implement a violation), we should just do
array_accum(). If not, we might as well do the standard array_agg(),
perhaps without the ORDER BY clause.

We could also do both, because it is a little annoying to coalesce the
result or array_agg().

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] can we create a record of more than one field?

2008-10-30 Thread Tom Lane
"Jaime Casanova" <[EMAIL PROTECTED]> writes:
> in the spanish list someone reports that the following function
> doesn't even compile:

> CREATE FUNCTION select_temporal(OUT valor integer) RETURNS SETOF record AS
> $BODY$
> begin
> return query select 1::integer;
> return;
> end;
> $BODY$
> LANGUAGE 'plpgsql' VOLATILE;

> and this is the error message
> ERROR:  function result type must be integer because of OUT parameters

It's not a record, it's an integer... if you had more than one out
parameter then "record" would be sensible ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] can we create a record of more than one field?

2008-10-30 Thread Jaime Casanova
Hi,

in the spanish list someone reports that the following function
doesn't even compile:

CREATE FUNCTION select_temporal(OUT valor integer) RETURNS SETOF record AS
$BODY$
begin
return query select 1::integer;
return;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

and this is the error message
ERROR:  function result type must be integer because of OUT parameters

i was expecting the result to be a record because of the SETOF record.
it's a bug or intentional?

this is in 8.3.4

-- 
regards,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BufferAccessStrategy for bulk insert

2008-10-30 Thread Robert Haas
> Whatever timings you have are worth publishing.

Here are the timings for copying the first ten million integers into a
one-column table created in the same transaction, with and without the
patch.  As you can see, now that I've corrected my previous error of
not putting CREATE TABLE and COPY in the same transaction, the savings
are quite substantial, about 15%.  Nice!

Trunk:
Time: 18931.516 ms
Time: 18251.732 ms
Time: 17284.274 ms
Time: 15900.131 ms
Time: 16439.617 ms

Patch:
Time: 14852.123 ms
Time: 15673.759 ms
Time: 15776.450 ms
Time: 14160.266 ms
Time: 13374.243 ms

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BufferAccessStrategy for bulk insert

2008-10-30 Thread Robert Haas
> You should try profiling the patch. You can count the invocations of the
> buffer access routines to check its all working in the right ratios.

*goes and learns how to do profile PostgreSQL*

OK, that was a good suggestion.  It looks like part of my problem here
is that I didn't put the CREATE TABLE and the COPY into the same
transaction.  As a result, a lot of time was spent on XLogInsert.
Modified the test case, new profiling results attached.

...Robert


gmon.trunk.txt.gz
Description: GNU Zip compressed data


gmon.patched.txt.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange query behavior where clause produces odd behavior on '>' query

2008-10-30 Thread Dann Corbit
> -Original Message-
> From: Tom Lane [mailto:[EMAIL PROTECTED]
> Sent: Thursday, October 30, 2008 6:34 PM
> To: Dann Corbit
> Cc: [HACKERS]; Sherry Griffin
> Subject: Re: [HACKERS] Strange query behavior where clause produces
odd
> behavior on '>' query
> 
> "Dann Corbit" <[EMAIL PROTECTED]> writes:
> >> What encoding/locale are you using?
> 
> > Whatever the default encoding/locale is.
> 
> "Whatever" is the wrong answer here.  I just finished verifying that
> the
> sort order you're complaining about is the expected ordering in some
> locales.  I suggest that you take the trouble to find out.

English (United States) is my locale.
 
> > Are you unable to reproduce it?
> 
> Well, I see this on a Fedora machine:
> 
> $ cat foo
> specd
> SPECD
> $ sort foo
> SPECD
> specd
> $ LANG=en_US sort foo
> specd
> SPECD
> $

The compare works as it should.
The only bug was in my understanding.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sun Studio compiler warnings

2008-10-30 Thread Robert Haas
Ooooh yeah.  Time for some caffeine.

...Robert

On Thu, Oct 30, 2008 at 9:34 PM, Alvaro Herrera
<[EMAIL PROTECTED]> wrote:
> Robert Haas escribió:
>> > The closing semicolon is strictly speaking not allowed here.  We could
>> > remove it, but that would probably upset pgindent?
>> >
>> > I recall that we used to have a bunch of similar problems with the AIX
>> > compilers a long time ago.  Does anyone recall the solution, and do we 
>> > still
>> > care?  (Note that it's only a warning in this case.)
>>
>> How about the good old
>>
>> do {
>> ...
>> } while (0)
>>
>> trick?
>
> That can't be used because the macro is defining a completely new
> function.
>
> --
> Alvaro Herrerahttp://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange query behavior where clause produces odd behavior on '>' query

2008-10-30 Thread Tom Lane
"Dann Corbit" <[EMAIL PROTECTED]> writes:
>> What encoding/locale are you using?  

> Whatever the default encoding/locale is.

"Whatever" is the wrong answer here.  I just finished verifying that the
sort order you're complaining about is the expected ordering in some
locales.  I suggest that you take the trouble to find out.

> Are you unable to reproduce it?

Well, I see this on a Fedora machine:

$ cat foo
specd
SPECD
$ sort foo
SPECD
specd
$ LANG=en_US sort foo
specd
SPECD
$ 

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sun Studio compiler warnings

2008-10-30 Thread Alvaro Herrera
Robert Haas escribió:
> > The closing semicolon is strictly speaking not allowed here.  We could
> > remove it, but that would probably upset pgindent?
> >
> > I recall that we used to have a bunch of similar problems with the AIX
> > compilers a long time ago.  Does anyone recall the solution, and do we still
> > care?  (Note that it's only a warning in this case.)
> 
> How about the good old
> 
> do {
> ...
> } while (0)
> 
> trick?

That can't be used because the macro is defining a completely new
function.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange query behavior where clause produces odd behavior on '>' query

2008-10-30 Thread Dann Corbit
> -Original Message-
> From: Tom Lane [mailto:[EMAIL PROTECTED]
> Sent: Thursday, October 30, 2008 5:31 PM
> To: Dann Corbit
> Cc: [HACKERS]; Sherry Griffin
> Subject: Re: [HACKERS] Strange query behavior where clause produces
odd
> behavior on '>' query
> 
> "Dann Corbit" <[EMAIL PROTECTED]> writes:
> > The following query:
> > SELECT * FROM Customers_connxstore where customerid > 'specd'
> > Returns the row containing Customers_connxstore.customerid ==
'SPECD'
> 
> What encoding/locale are you using?  

Whatever the default encoding/locale is.
I did not define any custom encoding, locale, or collating sequence.

> And while I'm asking, which PG
> version?

All versions from PostgreSQL 7.1.3 to "PostgreSQL 8.3.3, compiled by
Visual C++ build 1400" seem to display this behavior.
Are you unable to reproduce it?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sun Studio compiler warnings

2008-10-30 Thread Robert Haas
> The closing semicolon is strictly speaking not allowed here.  We could
> remove it, but that would probably upset pgindent?
>
> I recall that we used to have a bunch of similar problems with the AIX
> compilers a long time ago.  Does anyone recall the solution, and do we still
> care?  (Note that it's only a warning in this case.)

How about the good old

do {
...
} while (0)

trick?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange query behavior where clause produces odd behavior on '>' query

2008-10-30 Thread Tom Lane
"Dann Corbit" <[EMAIL PROTECTED]> writes:
> The following query:
> SELECT * FROM Customers_connxstore where customerid > 'specd'
> Returns the row containing Customers_connxstore.customerid == 'SPECD'

What encoding/locale are you using?  And while I'm asking, which PG version?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Strange query behavior where clause produces odd behavior on '>' query

2008-10-30 Thread Dann Corbit
The following query:
SELECT * FROM Customers_connxstore where customerid > 'specd'

Returns the row containing Customers_connxstore.customerid == 'SPECD'

I would expect to get that row if the query was:
SELECT * FROM Customers_connxstore where customerid >= 'specd'

The other operators (<, <=, >, =, !=) all work as expected.

Sample file to reproduce the problem:
http://cap.connx.com/bugreport/pgbug.sql.bz2

Is this a known issue with PostgreSQL or for some reason the desired
behavior?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG_PAGE_LAYOUT_VERSION 5 - time for change

2008-10-30 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> ... 3b sounds good until you
>> reflect that a genuinely variable chunk size would preclude random
>> access to sub-ranges of a toast value.  

> Hm, Heikki had me convinced it wouldn't but now that I try to explain it I
> can't get it to work. I think the idea is you start a scan at the desired
> offset and scan until you reach a chunk which overruns the end of the desired
> piece. However you really need to start scanning at the last chunk *prior* to
> the desired offset.

Yeah, that was my conclusion too.

> I think you can actually do this with btrees but I don't know if our apis
> support it. If you scan to find the first chunk > the desired offset and then
> scan backwards one tuple you should be looking at the chunk in which the
> desired offset lies.

Well, that might work but it would typically cost you an extra fetch.
Do we really have a use case for variable chunk size that is worth the
cost?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG_PAGE_LAYOUT_VERSION 5 - time for change

2008-10-30 Thread Gregory Stark
Tom Lane <[EMAIL PROTECTED]> writes:

>> 2) Add page type (e.g. btree) and subtype (e.g. metapage) flag into page 
>> header. 
>> I think It will be useful when we will use share buffer for clog.
>
> I think this is a pretty bad idea, because it'll eat space on every page
> for data that is only useful to indexes.  I don't believe that clog will
> find it interesting either.  To share buffers with clog will require
> a change in buffer lookup tags, not buffer contents.

Another example application which came to mind, if we ever wanted to do
something like retail vacuum, pruning, or hint bit setting from bgwriter it
would have to know how to tell heap pages apart from index pages. I'm not sure
whether that would have to be on the page or if it could be in the buffertag
as well?

If we do decide we want to do this it wouldn't have to be very much space. 16
page types with 16 subtypes each would be plenty which would fit on a single
byte.

>> 3) TOAST modification
>>a) TOAST table per attribute
>>b) replace chunk id with offset+variable chunk size
>>c) add column identification into first chunk
>
> I don't like 3a any more than Greg does.  3b sounds good until you
> reflect that a genuinely variable chunk size would preclude random
> access to sub-ranges of a toast value.  

Hm, Heikki had me convinced it wouldn't but now that I try to explain it I
can't get it to work. I think the idea is you start a scan at the desired
offset and scan until you reach a chunk which overruns the end of the desired
piece. However you really need to start scanning at the last chunk *prior* to
the desired offset.

I think you can actually do this with btrees but I don't know if our apis
support it. If you scan to find the first chunk > the desired offset and then
scan backwards one tuple you should be looking at the chunk in which the
desired offset lies.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG_PAGE_LAYOUT_VERSION 5 - time for change

2008-10-30 Thread Tom Lane
Zdenek Kotala <[EMAIL PROTECTED]> writes:
> 1) HeapTupleHeader modification

> typedef struct HeapTupleFields
> {
>   TransactionId t_xmin;   /* inserting xact ID */
>  TransactionId t_xmax;   /* deleting or locking xact ID */

>  union
>  {
>  CommandId   t_cid;
>  TransactionId   t_xvac;   /* VACUUM FULL xact ID */
>  }   t_field3;
>uint16  t_infomask;
> } HeapTupleFields;

This is unworkable (hint: the compiler will decide sizeof the struct
must be a multiple of 4).  I am also frightened to death of the proposal
to swap various bits around between infomask and infomask2 --- that is
*guaranteed* to break code silently.  And you didn't explain exactly
what it buys, anyway.  Not space savings in the Datum form; alignment
issues will prevent that.

> 2) Add page type (e.g. btree) and subtype (e.g. metapage) flag into page 
> header. 
> I think It will be useful when we will use share buffer for clog.

I think this is a pretty bad idea, because it'll eat space on every page
for data that is only useful to indexes.  I don't believe that clog will
find it interesting either.  To share buffers with clog will require
a change in buffer lookup tags, not buffer contents.

> 3) TOAST modification
>a) TOAST table per attribute
>b) replace chunk id with offset+variable chunk size
>c) add column identification into first chunk

I don't like 3a any more than Greg does.  3b sounds good until you
reflect that a genuinely variable chunk size would preclude random
access to sub-ranges of a toast value.  A column ID might be worth
adding for robustness purposes, though reducing the toast chunk payload
size to make that possible will cause you fits for in-place upgrade.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLE command

2008-10-30 Thread David Rowley
Peter Eisentraut Wrote:
> If I read this right, SQL would allow writing
> 
> TABLE foo;

Interesting; I managed to find it in the spec:


4) The 
TABLE 
is equivalent to the 
( SELECT * FROM  )


So going by that would the patch also have to support something like:

WITH a AS (SELECT * FROM b)
TABLE a; ?

I'd probably find it hard to find a use case. I'm too used to using SELECT *
FROM .. in psql. On the other hand last night I read a good web page
comparing the most popular RDBMS' for spec. compliance and PostgreSQL
probably was the most compliant all of the ones listed, (at least on the
topics covered). Oracle fails badly on '' IS NULL being true.

I enjoy seeing more spec compliant things being added. But on the other
hand, going with Tom's comments, if its lots of work for little gain... 




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] User defined I/O conversion casts

2008-10-30 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Here's a patch. It seems pretty straightforward, I'll apply this 
> tomorrow, barring objections. Note to self: bump the catalog version.

Looks sane in a fast once-over.  I didn't cross-check the pg_cast.h
data changes, but that's what the regression test check is for ;-)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] WIP: Automatic view update rules

2008-10-30 Thread Bernd Helmle


Please find attached my current  patch for automatic view update rules. 
This is a stripped down version of my former patch discussed before 8.3 
without CHECK OPTION support and less invasive changes to the rewriter 
itself.


I'm currently cleaning up the code with all occurences of multiple base 
relations (in progress) hence supporting SQL92 at the moment, only


If we decide to go further with this approach i would like to work on the 
CHECK OPTION implementation based on some ideas we had in the past (e.g. 
rewriter/executor support).


Note that i'm still working on this (for example, RETURNING is missing 
yet), As always, discussion welcome ;)


--
 Thanks

   Bernd


view_update.patch.bz2
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG_PAGE_LAYOUT_VERSION 5 - time for change

2008-10-30 Thread Gregory Stark
Zdenek Kotala <[EMAIL PROTECTED]> writes:

> 3) TOAST modification
>   a) TOAST table per attribute
>   b) replace chunk id with offset+variable chunk size
>   c) add column identification into first chunk
>
> Thats all. I think infomask/infomask2 shuffle flag should be done. TOAST
> modification complicates in-place upgrade.

I don't think TOAST table per attribute is feasible You would end up with
thousands of toast tables. It might be interesting as an option if you plan to
drop the column but I don't see it as terribly interesting.

What seemed to make sense to me for solving your problem was including the
type oid in the toast chunks. I suppose attribute number might be just as good
-- it would let you save upgrading chunks for dropped columns at the expense
of having to look up the column info first.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] User defined I/O conversion casts

2008-10-30 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
Yeah, a magical OID clearly has some issues. A new field in pg_cast is 
the obvious alternative. How about adding a "castmethod" char field, 
with values:

b = binary-compatible cast (CREATE CAST ... WITHOUT FUNCTION)
i = I/O coercion cast (the new beast, CREATE CAST ... WITH INOUT)
f = use cast function specified in castfunc field.
castfunc is 0 for methods b and i.


Seems sane to me.  If you do that, please add a check in the opr_sanity
regression test that castfunc is nonzero if and only if castmethod is f.


Here's a patch. It seems pretty straightforward, I'll apply this 
tomorrow, barring objections. Note to self: bump the catalog version.


BTW, it looks like we don't have any test cases for the CREATE CAST 
command. I think I'll add one.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** doc/src/sgml/catalogs.sgml
--- doc/src/sgml/catalogs.sgml
***
*** 1415,1423 
 cannot be deduced from some generic rule.  For example, casting between a
 domain and its base type is not explicitly represented in
 pg_cast.  Another important exception is that
!I/O conversion casts, those performed using a data type's own
!I/O functions to convert to or from text or other string types,
!are not explicitly represented in pg_cast.

  

--- 1415,1424 
 cannot be deduced from some generic rule.  For example, casting between a
 domain and its base type is not explicitly represented in
 pg_cast.  Another important exception is that
!automatic I/O conversion casts, those performed using a data
!type's own I/O functions to convert to or from text or other
!string types, are not explicitly represented in
!pg_cast.

  

***
*** 1454,1461 
pg_proc.oid

 The OID of the function to use to perform this cast.  Zero is
!stored if the data types are binary coercible (that is, no
!run-time operation is needed to perform the cast)

   
  
--- 1455,1461 
pg_proc.oid

 The OID of the function to use to perform this cast.  Zero is
!stored if the cast method doesn't require a function.

   
  
***
*** 1473,1478 
--- 1473,1489 
 other cases

   
+  
+   castmethod
+   char
+   
+   
+Indicates how the cast is performed.
+f means that the function specified in the castfunc field is used.
+i means that the input/output functions are used.
+b means that the types are binary-coercible, thus no conversion is required
+   
+  
  
 

*** doc/src/sgml/ref/create_cast.sgml
--- doc/src/sgml/ref/create_cast.sgml
***
*** 24,29  CREATE CAST (sourcetype AS targettypesourcetype AS targettype)
  WITHOUT FUNCTION
  [ AS ASSIGNMENT | AS IMPLICIT ]
+ 
+ CREATE CAST (sourcetype AS targettype)
+ WITH INOUT
+ [ AS ASSIGNMENT | AS IMPLICIT ]
  
   
  
***
*** 59,64  SELECT CAST(42 AS float8);
--- 63,75 

  

+You can define a cast as an I/O conversion cast using
+the WITH INOUT syntax. An I/O conversion cast is
+performed by invoking the output function of the source data type, and
+passing the result to the input function of the target data type.
+   
+ 
+   
 By default, a cast can be invoked only by an explicit cast request,
 that is an explicit CAST(x AS
 typename) or
***
*** 200,205  SELECT CAST ( 2 AS numeric ) + 4.0;
--- 211,228 
  
  
  
+  WITH INOUT
+ 
+  
+   
+Indicates that the cast is an I/O conversion cast, performed by
+invoking the output function of the source data type, and passing the
+result to the input function of the target data type.
+   
+  
+ 
+ 
+ 
   AS ASSIGNMENT
  
   
***
*** 284,296  SELECT CAST ( 2 AS numeric ) + 4.0;
 It is normally not necessary to create casts between user-defined types
 and the standard string types (text, varchar, and
 char(n), as well as user-defined types that
!are defined to be in the string category).  PostgreSQL will
!automatically handle a cast to a string type by invoking the other
!type's output function, or conversely handle a cast from a string type
!by invoking the other type's input function.  These
!automatically-provided casts are known as I/O conversion
!casts.  I/O conversion casts to string types are treated as
!assignment casts, while I/O conversion casts from string types are
 explicit-only.  You can override this behavior by declaring your own
 cast to replace an I/O conversion cast, but usually the only reason to
 do so is if you want the conversion to be more easily invokable than the
--- 307,316 
 It is normally not necessary to create casts

[HACKERS] Updated posix fadvise patch v19

2008-10-30 Thread Gregory Stark

Here is an updated posix_fadvise patch against CVS HEAD. It's my first patch
generated using git, yay. I remembered to strip out configure (WHY is that
STILL in CVS!?) and convert it to context diff, but if there's anything else
I've missed just shout.

Changes from before:

1) Based on discussions changed the parameter to effective_io_concurrency and
   modified the documentation discussion to include comments about SSDs and
   experimentation being necessary.

2) standardized all references to "prefetch" from the mixture of same and
   "preread" and "advise". This means the main entry point is now
   PrefetchBuffer().

3) Added run-time check for buggy glibc versions with bad posix_fadvise()
   since it was pointed out that someone could run on a different version of
   glibc than it was compiled on (such as with binary installers). I kept the
   autoconf test though arguably it's redundant now and could be removed.

Also, the previous patch included support for Zoltan's situation using
POSIX_FADV_SEQUENTIAL for bulk sequential scans. I experimented with this and
was not able to find any situation it improved. I've left it in so others can
test and show it's helpful. This bit of code is a bit less polished -- it's
missing documentation for the guc variable and there are some #defines which
are probably in the wrong .h file (and probably should be an enum).

This code is entirely separate from the prefetching code. They happen to both
use posix_fadvise but I don't think they belong in the same abstraction for
Postgres. PrefetchBuffer does just one thing -- prefetches a buffer.
Sequential i/o needs to be set and tracked per-file.

I have not changed the query costing side of things. I'm a bit leery about
making those changes as I fear it will lead me to propose a major reworking of
these parameters and I don't want to go there...




posix-fadvise-v19.patch.gz
Description: Binary data

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PG_PAGE_LAYOUT_VERSION 5 - time for change

2008-10-30 Thread Zdenek Kotala
It seems that we are going to bump Page Layout Version to version 5 (see CRC 
patch for detail). Maybe it is good time to do some other changes. There is a 
list of ideas (please, do not beat me :-). Some of them we discussed in Prato 
and Greg maybe have more.


1) HeapTupleHeader modification

typedef struct HeapTupleFields
{
TransactionId t_xmin;   /* inserting xact ID */
TransactionId t_xmax;   /* deleting or locking xact ID */

union
{
CommandId   t_cid;
TransactionId   t_xvac;   /* VACUUM FULL xact ID */
}   t_field3;
 uint16  t_infomask;
} HeapTupleFields;

typedef struct HeapTupleHeaderData
{
union
{
HeapTupleFields t_heap;
DatumTupleFields t_datum;
}   t_choice;

ItemPointerData t_ctid; /* current TID of this or newer tuple */

/* Fields below here must match MinimalTupleData! */

uint16  t_infomask2;
uint8   t_hoff;

/* ^ - 23 bytes - ^ */

bits8   t_bits[1];
} HeapTupleHeaderData;

This also requires shuffle flags between infomask and infomask2. infomask2 
should have only flag: HASNULL,HASOID,HASVARWIDTH and HASEXTERNAL And minimal 
tuple does not need infomask field which will contains only transaction hint 
bits. Unfortunately, structure alligment is not much friendly.


2) Add page type (e.g. btree) and subtype (e.g. metapage) flag into page header. 
I think It will be useful when we will use share buffer for clog.


3) TOAST modification
  a) TOAST table per attribute
  b) replace chunk id with offset+variable chunk size
  c) add column identification into first chunk

Thats all. I think infomask/infomask2 shuffle flag should be done. TOAST 
modification complicates in-place upgrade.


Comments other ideas?

Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements

2008-10-30 Thread Martin Pihlak
ITAGAKI Takahiro wrote:
>> But is the idea of extending QueryDesc generally acceptable? Is it OK
>> to make a copy of the query string?
> 
> The only thing I'm worried about is that QueryDesc lives longer than
> its queryText. Can I assume it never occurs?
> 

I just finished validating this -- it seems OK. All of the query strings
that make it to CreateQueryDesc are either pstrdup-ed to Portal, in long
lived memory context or just literals. So no need to make extra copies :)

regards,
Martin

*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
***
*** 1056,1062  DoCopy(const CopyStmt *stmt, const char *queryString)
  		/* Create a QueryDesc requesting no output */
  		cstate->queryDesc = CreateQueryDesc(plan, GetActiveSnapshot(),
  			InvalidSnapshot,
! 			dest, NULL, false);
  
  		/*
  		 * Call ExecutorStart to prepare the plan for execution.
--- 1056,1062 
  		/* Create a QueryDesc requesting no output */
  		cstate->queryDesc = CreateQueryDesc(plan, GetActiveSnapshot(),
  			InvalidSnapshot,
! 			dest, NULL, false, queryString);
  
  		/*
  		 * Call ExecutorStart to prepare the plan for execution.
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 172,178  ExplainOneQuery(Query *query, ExplainStmt *stmt, const char *queryString,
  		plan = pg_plan_query(query, 0, params);
  
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, params, stmt, tstate);
  	}
  }
  
--- 172,178 
  		plan = pg_plan_query(query, 0, params);
  
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, params, stmt, tstate, queryString);
  	}
  }
  
***
*** 219,225  ExplainOneUtility(Node *utilityStmt, ExplainStmt *stmt,
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate)
  {
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
--- 219,226 
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate,
! 			   const char *queryString)
  {
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
***
*** 238,244  ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
  	queryDesc = CreateQueryDesc(plannedstmt,
  GetActiveSnapshot(), InvalidSnapshot,
  None_Receiver, params,
! stmt->analyze);
  
  	INSTR_TIME_SET_CURRENT(starttime);
  
--- 239,245 
  	queryDesc = CreateQueryDesc(plannedstmt,
  GetActiveSnapshot(), InvalidSnapshot,
  None_Receiver, params,
! stmt->analyze, queryString);
  
  	INSTR_TIME_SET_CURRENT(starttime);
  
*** a/src/backend/commands/prepare.c
--- b/src/backend/commands/prepare.c
***
*** 697,703  ExplainExecuteQuery(ExecuteStmt *execstmt, ExplainStmt *stmt,
  pstmt->intoClause = execstmt->into;
  			}
  
! 			ExplainOnePlan(pstmt, paramLI, stmt, tstate);
  		}
  		else
  		{
--- 697,703 
  pstmt->intoClause = execstmt->into;
  			}
  
! 			ExplainOnePlan(pstmt, paramLI, stmt, tstate, queryString);
  		}
  		else
  		{
*** a/src/backend/executor/functions.c
--- b/src/backend/executor/functions.c
***
*** 309,320  postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
  		es->qd = CreateQueryDesc((PlannedStmt *) es->stmt,
   snapshot, InvalidSnapshot,
   None_Receiver,
!  fcache->paramLI, false);
  	else
  		es->qd = CreateUtilityQueryDesc(es->stmt,
  		snapshot,
  		None_Receiver,
! 		fcache->paramLI);
  
  	/* We assume we don't need to set up ActiveSnapshot for ExecutorStart */
  
--- 309,320 
  		es->qd = CreateQueryDesc((PlannedStmt *) es->stmt,
   snapshot, InvalidSnapshot,
   None_Receiver,
!  fcache->paramLI, false, fcache->src);
  	else
  		es->qd = CreateUtilityQueryDesc(es->stmt,
  		snapshot,
  		None_Receiver,
! 		fcache->paramLI, fcache->src);
  
  	/* We assume we don't need to set up ActiveSnapshot for ExecutorStart */
  
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
***
*** 1795,1801  _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
  qdesc = CreateQueryDesc((PlannedStmt *) stmt,
  		snap, crosscheck_snapshot,
  		dest,
! 		paramLI, false);
  res = _SPI_pquery(qdesc, fire_triggers,
    canSetTag ? tcount : 0);
  FreeQueryDesc(qdesc);
--- 1795,1802 
  qdesc = CreateQueryDesc((PlannedStmt *) stmt,
  		snap, crosscheck_snapshot,
  		dest,
! 		paramLI, false,
! 		plansource->query_string);
  res = _SPI_pquery(qdesc, fire_triggers,
    canSetTag ? tcount : 0);
  FreeQueryDesc(qdesc);
*** a/src/backend/tcop/pquery.c
--- b/src/backend/tcop/pquery.c
***
*** 37,43  Portal		ActivePortal = NULL;
  static void ProcessQ

Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Martijn van Oosterhout
On Thu, Oct 30, 2008 at 03:41:17PM +, Gregory Stark wrote:
> The CRC is chosen such that if you CRC the resulting packet including the CRC
> you get a CRC of 0. That can be done for whatever offset the CRC appears at I
> believe.

IIRC, you calculate the CRC-32 of the page, then XOR it over where it's
supposed to end up. No need to preseed (or more accurately, it doesn't
matter how you preseed, the result is the same).

For checking it doesn't matter either, just checksum the page and if
you get zero it's correct.

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Alvaro Herrera
Alvaro Herrera wrote:

> Hmm, oh I see another problem here -- the bit is not restored when
> replayed heap_update's WAL record.  I'm now wondering what other bits
> are set without much care about correctly restoring them during replay.

I'm now wondering whether it'd be easier to just ignore pd_flags in
calculating the checksum.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Jonah H. Harris
On Thu, Oct 30, 2008 at 12:14 PM, Alvaro Herrera
<[EMAIL PROTECTED]> wrote:
> Gregory Stark escribió:
>
>> What I'm wondering though -- are we going to make CRCs mandatory? Or set 
>> aside
>> the 4 bytes even if you're not using them? Because if the size of the page
>> header varies depending on whether you're using CRCs that sounds like it 
>> would
>> be quite a pain.
>
> Not mandatory, but the space needs to be set aside.  (Otherwise you
> couldn't turn it on after running with it turned off, which would rule
> out using the database after initdb).

Agreed.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch: convert SQL-language functions to return tuplestores

2008-10-30 Thread Robert Haas
> With session variables we could implement srf function in plpgsql like
> current C srf function. Like
>
> create or replace function foo()
> returns record as $$
> #option with_srf_context(datatype of srf context)
> begin
>   return row(...);
> end;
> $$ language plpgsql;

Oh, sure - but what you can do with this will be somewhat limited
compared to a Perl hash reference off which you can chain any
arbitrary data structure with ease.  I'd want to see an actual use
case for this before anyone bothered implementing it.  I was actually
thinking one way to do it would be to extend the variable declaration
syntax so that you could declare n>=0 variables as SRF context
variables, which I think is nicer, but even that I think is of limited
usefulness.  I think the biggest value of PL/plgsql is the ability to
RETURN QUERY, and I think the ability to push a lazy execution model
down into that subordinate query is where the win is.  That case won't
be helped at all by this sort of alternate calling convention - in
fact it'll be nearly impossible to even do that at all with this type
of execution model.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] HeapTuple version extension + code cleanup

2008-10-30 Thread Zdenek Kotala
This patch requires patch: 
http://archives.postgresql.org/message-id/[EMAIL PROTECTED]


It add t_ver information into HeapTuple structure and add DatumGetHeapTuple 
function which allows to reduce code on many places.


Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql



heaptuple_ver.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Alvaro Herrera
Gregory Stark escribió:

> What I'm wondering though -- are we going to make CRCs mandatory? Or set aside
> the 4 bytes even if you're not using them? Because if the size of the page
> header varies depending on whether you're using CRCs that sounds like it would
> be quite a pain.

Not mandatory, but the space needs to be set aside.  (Otherwise you
couldn't turn it on after running with it turned off, which would rule
out using the database after initdb).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question about GetAttributeByNum(Name) ExecQual.c

2008-10-30 Thread Tom Lane
Zdenek Kotala <[EMAIL PROTECTED]> writes:
> Does it mean that these function are every time called with heap tuple or 
> untoasted row type (composite data type)? What is purpose of these function. 

Legacy support for third-party modules.  They're really pretty much
deprecated but I don't foresee removing them.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Zdenek Kotala

Tom Lane napsal(a):

Zdenek Kotala <[EMAIL PROTECTED]> writes:
By the way, do you need CRC as a first page member? Is it for future development 
like CLOG integration into buffers? Why not put it on the end as and mark it as 
a special? It will reduce space requirement when CRC is not enabled.


... and make life tremendously more complex for indexes, 


Indexes  uses PageGetSpecial macro and they live with them and PageInit could do 
 correct placement. Only problem are assert macros and extra check which 
verifies correct size of special.



plus turning
CRC checking on or off on-the-fly would be problematic.  


Yeah, it is problem.


I think Alvaro
has the right idea: just put the field there all the time.


Agree.

Zdenek



--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Tom Lane
Zdenek Kotala <[EMAIL PROTECTED]> writes:
> By the way, do you need CRC as a first page member? Is it for future 
> development 
> like CLOG integration into buffers? Why not put it on the end as and mark it 
> as 
> a special? It will reduce space requirement when CRC is not enabled.

... and make life tremendously more complex for indexes, plus turning
CRC checking on or off on-the-fly would be problematic.  I think Alvaro
has the right idea: just put the field there all the time.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch: convert SQL-language functions to return tuplestores

2008-10-30 Thread Tom Lane
I wrote:
> I'm currently going to have a look at just what it would take to support
> both lazy and eager evaluation in functions.c (independently of what
> syntax, if any, we settle on to expose the choice to the user).  If it's
> either really awful or really easy we should know that before arguing
> further.

Attached is a draft patch that allows SQL functions to return sets using
either value-per-call or materialize mode.  It does not expose any
control to the user; for the moment, the choice is driven by whether the
call site is ExecMakeFunctionResult (which prefers value-per-call) or
ExecMakeTableFunctionResult (which prefers materialize).  I estimate
that functions.c is two or three hundred lines longer than it would be
if we stripped the value-per-call support and simplified the logic down
to what I had in my prior patch.  Which is not all that much in the
big scheme of things, so I'll withdraw my argument for simplifying.

I'm not sure if it's worth adding a control knob or not --- it's still
true that materialize is faster on a tuple-by-tuple basis, but whether
the difference is all that significant for nontrivial queries is
debatable.  Anyway I don't really want to work on that point right now.
The next step is to make it actually support RETURNING queries, and if
I don't get on with that I won't finish it before commitfest.

regards, tom lane



binpZLb0Tylz0.bin
Description: sql-functions-2.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Zdenek Kotala

Alvaro Herrera napsal(a):

Zdenek Kotala wrote:

Alvaro Herrera napsal(a):

Simon Riggs wrote:


But perhaps writing a single WAL record if you scan whole page and set
all bits at once. Then it makes sense in some cases.

So this is what I ended up doing; attached.

Please, DO NOT MOVE position of page version in PageHeader structure!


Hmm.  The only way I see we could do that is to modify the checksum
struct member to a predefined value before calculating the page's
checksum.

Ah, actually there's another alternative -- leave the checksum on its
current position (start of struct) and move other members below
pg_pagesize_version (leaning towards pd_tli and pd_flags).  That'd leave
the page version in the same position.

(Hmm, maybe it's better to move pd_lower and pd_upper?)


No, please, keep pd_lower and pd_upper on same position. They are accessed more 
often than pd_tli and pd_flags. It is better for optimization.


By the way, do you need CRC as a first page member? Is it for future development 
like CLOG integration into buffers? Why not put it on the end as and mark it as 
a special? It will reduce space requirement when CRC is not enabled.


Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Gregory Stark
Alvaro Herrera <[EMAIL PROTECTED]> writes:

> Ah, actually there's another alternative -- leave the checksum on its
> current position (start of struct) and move other members below
> pg_pagesize_version (leaning towards pd_tli and pd_flags).  That'd leave
> the page version in the same position.

I don't understand why the position of anything matters here. Look at TCP
packets for instance, the checksum is not at the beginning or end of anything.

The CRC is chosen such that if you CRC the resulting packet including the CRC
you get a CRC of 0. That can be done for whatever offset the CRC appears at I
believe.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R pacth

2008-10-30 Thread Markus Wanner
Hi,

Imre Geczy wrote:
> What kind of form or method must be used to patch that it can work correctly?

I finally got around writing some installation instructions:
http://www.postgres-r.org/documentation/installation

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Jonah H. Harris
On Thu, Oct 30, 2008 at 11:27 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Jonah H. Harris" <[EMAIL PROTECTED]> writes:
>> On Thu, Oct 30, 2008 at 11:14 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>>> Well, yeah, but it has to be able to tell which version it's dealing
>>> with.  I quite agree with Zdenek that keeping the version indicator
>>> in a fixed location is appropriate.
>
>> Most of the other databases I've worked, which don't have different
>> types of pages, put the page version as the first element of the page.
>>  That would let us put the crc right after it.  Thoughts?
>
> "Fixed location" does not mean "let's move it".

Just trying to be helpful.  Just thought I might give some insight as
to what others, who had implemented in-place upgrade functionality
years before Postgres' existence, had done.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Zdenek Kotala

Jonah H. Harris napsal(a):

On Thu, Oct 30, 2008 at 10:33 AM, Zdenek Kotala <[EMAIL PROTECTED]> wrote:

Please, DO NOT MOVE position of page version in PageHeader structure! And
PG_PAGE_LAYOUT_VERSION should be bump to 5.


Umm, any in-place upgrade should be capable of handling changes to the
page header.  Of, did I miss something significant in the in-place
upgrade design?


Not any change. If you move page header version field to another position it 
will require kind of magic to detect what version it is. Other field you can 
place everywhere :-), but do not touch page version. It will brings a lot of 
problems...


Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Tom Lane
"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> On Thu, Oct 30, 2008 at 11:14 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> Well, yeah, but it has to be able to tell which version it's dealing
>> with.  I quite agree with Zdenek that keeping the version indicator
>> in a fixed location is appropriate.

> Most of the other databases I've worked, which don't have different
> types of pages, put the page version as the first element of the page.
>  That would let us put the crc right after it.  Thoughts?

"Fixed location" does not mean "let's move it".

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Alvaro Herrera
Gregory Stark wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> 
> > Alvaro Herrera wrote:
> >
> >> The "other hint bits" are:
> >> 
> >> - LP_DEAD as used by the various callers of ItemIdMarkDead.
> >> - PD_PAGE_FULL
> >> - BTPageOpaque->btpo_flags and btpo_cycleid
> >> 
> >> All of them are changed with only SetBufferCommitInfoNeedsSave being
> >> called afterwards.
> >
> > I think we could get away with WAL-logging LP_DEAD via ItemIdMarkDead
> > similar to what is done to SetHintBits in the posted patch, and cope
> > with the rest by marking the page with the invalid checksum; they are
> > not so frequent anyway so the reliability loss is low.
> 
> If PD_PAGE_FULL is set and that causes the crc to be set to the invalid sum
> will we ever get another chance to set it?

I should have qualified that a bit more.  It's not setting PD_FULL
that's not logged, but clearing it (heap_prune_page, line 282).  It's
set in heap_update.

Hmm, oh I see another problem here -- the bit is not restored when
replayed heap_update's WAL record.  I'm now wondering what other bits
are set without much care about correctly restoring them during replay.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Jonah H. Harris
On Thu, Oct 30, 2008 at 11:14 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Jonah H. Harris" <[EMAIL PROTECTED]> writes:
>> On Thu, Oct 30, 2008 at 10:33 AM, Zdenek Kotala <[EMAIL PROTECTED]> wrote:
>>> Please, DO NOT MOVE position of page version in PageHeader structure! And
>>> PG_PAGE_LAYOUT_VERSION should be bump to 5.
>
>> Umm, any in-place upgrade should be capable of handling changes to the
>> page header.
>
> Well, yeah, but it has to be able to tell which version it's dealing
> with.  I quite agree with Zdenek that keeping the version indicator
> in a fixed location is appropriate.

Most of the other databases I've worked, which don't have different
types of pages, put the page version as the first element of the page.
 That would let us put the crc right after it.  Thoughts?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Tom Lane
"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> On Thu, Oct 30, 2008 at 10:33 AM, Zdenek Kotala <[EMAIL PROTECTED]> wrote:
>> Please, DO NOT MOVE position of page version in PageHeader structure! And
>> PG_PAGE_LAYOUT_VERSION should be bump to 5.

> Umm, any in-place upgrade should be capable of handling changes to the
> page header.

Well, yeah, but it has to be able to tell which version it's dealing
with.  I quite agree with Zdenek that keeping the version indicator
in a fixed location is appropriate.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Alvaro Herrera
Zdenek Kotala wrote:
> Alvaro Herrera napsal(a):
>> Simon Riggs wrote:
>>
>>> But perhaps writing a single WAL record if you scan whole page and set
>>> all bits at once. Then it makes sense in some cases.
>>
>> So this is what I ended up doing; attached.
>
> Please, DO NOT MOVE position of page version in PageHeader structure!

Hmm.  The only way I see we could do that is to modify the checksum
struct member to a predefined value before calculating the page's
checksum.

Ah, actually there's another alternative -- leave the checksum on its
current position (start of struct) and move other members below
pg_pagesize_version (leaning towards pd_tli and pd_flags).  That'd leave
the page version in the same position.

(Hmm, maybe it's better to move pd_lower and pd_upper?)

> And PG_PAGE_LAYOUT_VERSION should be bump to 5.

Easily done; thanks for the reminder.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
*** src/include/storage/bufpage.h	14 Jul 2008 03:22:32 -	1.83
--- src/include/storage/bufpage.h	30 Oct 2008 15:05:24 -
***
*** 17,22 
--- 17,23 
  #include "access/xlogdefs.h"
  #include "storage/item.h"
  #include "storage/off.h"
+ #include "utils/pg_crc.h"
  
  /*
   * A postgres disk page is an abstraction layered on top of a postgres
***
*** 87,92 
--- 88,94 
   *
   * space management information generic to any page
   *
+  *		pd_checksum	- the checksum of the page
   *		pd_lsn		- identifies xlog record for last change to this page.
   *		pd_tli		- ditto.
   *		pd_flags	- flag bits.
***
*** 118,136 
   * the constraint on pagesize mod 256 is not an important restriction.
   * On the high end, we can only support pages up to 32KB because lp_off/lp_len
   * are 15 bits.
   */
  typedef struct PageHeaderData
  {
! 	/* XXX LSN is member of *any* block, not only page-organized ones */
  	XLogRecPtr	pd_lsn;			/* LSN: next byte after last byte of xlog
   * record for last change to this page */
- 	uint16		pd_tli;			/* least significant bits of the TimeLineID
-  * containing the LSN */
- 	uint16		pd_flags;		/* flag bits, see below */
  	LocationIndex pd_lower;		/* offset to start of free space */
  	LocationIndex pd_upper;		/* offset to end of free space */
  	LocationIndex pd_special;	/* offset to start of special space */
  	uint16		pd_pagesize_version;
  	TransactionId pd_prune_xid; /* oldest prunable XID, or zero if none */
  	ItemIdData	pd_linp[1];		/* beginning of line pointer array */
  } PageHeaderData;
--- 120,143 
   * the constraint on pagesize mod 256 is not an important restriction.
   * On the high end, we can only support pages up to 32KB because lp_off/lp_len
   * are 15 bits.
+  *
+  * Note that pd_tli appears in a rather awkward position in the struct;
+  * this is because we moved it to accomodate pd_checksum without changing
+  * pg_pagesize_version's offset.
   */
  typedef struct PageHeaderData
  {
! 	/* XXX CRC & LSN are members of *any* block, not only page-organized ones */
! 	pg_crc32	pd_checksum;/* The block-level checksum */
  	XLogRecPtr	pd_lsn;			/* LSN: next byte after last byte of xlog
   * record for last change to this page */
  	LocationIndex pd_lower;		/* offset to start of free space */
  	LocationIndex pd_upper;		/* offset to end of free space */
  	LocationIndex pd_special;	/* offset to start of special space */
  	uint16		pd_pagesize_version;
+ 	uint16		pd_tli;			/* least significant bits of the TimeLineID
+  * containing the LSN */
+ 	uint16		pd_flags;		/* flag bits, see below */
  	TransactionId pd_prune_xid; /* oldest prunable XID, or zero if none */
  	ItemIdData	pd_linp[1];		/* beginning of line pointer array */
  } PageHeaderData;
***
*** 148,159 
   * PD_PAGE_FULL is set if an UPDATE doesn't find enough free space in the
   * page for its new tuple version; this suggests that a prune is needed.
   * Again, this is just a hint.
   */
  #define PD_HAS_FREE_LINES	0x0001		/* are there any unused line pointers? */
  #define PD_PAGE_FULL		0x0002		/* not enough free space for new
  		 * tuple? */
  
! #define PD_VALID_FLAG_BITS	0x0003		/* OR of all valid pd_flags bits */
  
  /*
   * Page layout version number 0 is for pre-7.3 Postgres releases.
--- 155,172 
   * PD_PAGE_FULL is set if an UPDATE doesn't find enough free space in the
   * page for its new tuple version; this suggests that a prune is needed.
   * Again, this is just a hint.
+  *
+  * PG_UNLOGGED_CHANGE indicates whether a process has set hint bits on the
+  * page.  This is used to determine whether a WAL message needs to be emitted
+  * before writing the page to disk when page checksums are enabled.
   */
  #define PD_HAS_FREE_LINES	0x0001		/* are there any unused line pointers? */
  #define PD_PAGE_FULL		0x0002		/* not enough free space for new
  		 * tuple? */
+ #define PD_UNLOGGED_CHANG

Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Gregory Stark
"Joshua D. Drake" <[EMAIL PROTECTED]> writes:

> Jonah H. Harris wrote:
>> On Thu, Oct 30, 2008 at 10:33 AM, Zdenek Kotala <[EMAIL PROTECTED]> wrote:
>>> Please, DO NOT MOVE position of page version in PageHeader structure! And
>>> PG_PAGE_LAYOUT_VERSION should be bump to 5.
>>
>> Umm, any in-place upgrade should be capable of handling changes to the
>> page header.  Of, did I miss something significant in the in-place
>
> I thought that was kind of the point of in place upgrade.

Sure, but he has to have a reliable way to tell what version of the page
header he's looking at...

What I'm wondering though -- are we going to make CRCs mandatory? Or set aside
the 4 bytes even if you're not using them? Because if the size of the page
header varies depending on whether you're using CRCs that sounds like it would
be quite a pain.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Joshua D. Drake

Jonah H. Harris wrote:

On Thu, Oct 30, 2008 at 10:33 AM, Zdenek Kotala <[EMAIL PROTECTED]> wrote:

Please, DO NOT MOVE position of page version in PageHeader structure! And
PG_PAGE_LAYOUT_VERSION should be bump to 5.


Umm, any in-place upgrade should be capable of handling changes to the
page header.  Of, did I miss something significant in the in-place


I thought that was kind of the point of in place upgrade.

Joshua D. Drake

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Jonah H. Harris
On Thu, Oct 30, 2008 at 10:33 AM, Zdenek Kotala <[EMAIL PROTECTED]> wrote:
> Please, DO NOT MOVE position of page version in PageHeader structure! And
> PG_PAGE_LAYOUT_VERSION should be bump to 5.

Umm, any in-place upgrade should be capable of handling changes to the
page header.  Of, did I miss something significant in the in-place
upgrade design?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Zdenek Kotala

Alvaro Herrera napsal(a):

Simon Riggs wrote:


But perhaps writing a single WAL record if you scan whole page and set
all bits at once. Then it makes sense in some cases.


So this is what I ended up doing; attached.

There are some gotchas in this patch:



Please, DO NOT MOVE position of page version in PageHeader structure! And 
PG_PAGE_LAYOUT_VERSION should be bump to 5.


Thanks Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sun Studio compiler warnings

2008-10-30 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> CMPFUNC(tsquery_lt, res < 0);
> CMPFUNC(tsquery_le, res <= 0);
> CMPFUNC(tsquery_eq, res == 0);
> CMPFUNC(tsquery_ge, res >= 0);
> CMPFUNC(tsquery_gt, res > 0);
> CMPFUNC(tsquery_ne, res != 0);

> The closing semicolon is strictly speaking not allowed here.  We could 
> remove it, but that would probably upset pgindent?

If the warnings annoy you, do what PG_FUNCTION_INFO_V1 does.

#define PG_FUNCTION_INFO_V1(funcname) \
extern PGDLLIMPORT const Pg_finfo_record * CppConcat(pg_finfo_,funcname)(void); 
\
const Pg_finfo_record * \
CppConcat(pg_finfo_,funcname) (void) \
{ \
static const Pg_finfo_record my_finfo = { 1 }; \
return &my_finfo; \
} \
extern int no_such_variable


regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Sun Studio compiler warnings

2008-10-30 Thread Peter Eisentraut

Sun Studio is producing these compiler warnings (among others):

"tsquery_op.c", line 193: warning: syntax error:  empty declaration 
"tsquery_op.c", line 194: warning: syntax error:  empty declaration 
"tsquery_op.c", line 195: warning: syntax error:  empty declaration 
"tsquery_op.c", line 196: warning: syntax error:  empty declaration 
"tsquery_op.c", line 197: warning: syntax error:  empty declaration 
"tsquery_op.c", line 198: warning: syntax error:  empty declaration


"tsvector_op.c", line 177: warning: syntax error:  empty declaration 
"tsvector_op.c", line 178: warning: syntax error:  empty declaration 
"tsvector_op.c", line 179: warning: syntax error:  empty declaration 
"tsvector_op.c", line 180: warning: syntax error:  empty declaration 
"tsvector_op.c", line 181: warning: syntax error:  empty declaration 
"tsvector_op.c", line 182: warning: syntax error:  empty declaration 
"tsvector_op.c", line 183: warning: syntax error:  empty declaration


This relates to the following sort of code:

#define CMPFUNC( NAME, CONDITION )  \
Datum   \
NAME(PG_FUNCTION_ARGS) {\
TSQuery  a = PG_GETARG_TSQUERY_COPY(0); \
TSQuery  b = PG_GETARG_TSQUERY_COPY(1); \
int res = CompareTSQ(a,b);  \
\
PG_FREE_IF_COPY(a,0);   \
PG_FREE_IF_COPY(b,1);   \
\
PG_RETURN_BOOL( CONDITION );\
}

CMPFUNC(tsquery_lt, res < 0);
CMPFUNC(tsquery_le, res <= 0);
CMPFUNC(tsquery_eq, res == 0);
CMPFUNC(tsquery_ge, res >= 0);
CMPFUNC(tsquery_gt, res > 0);
CMPFUNC(tsquery_ne, res != 0);

The closing semicolon is strictly speaking not allowed here.  We could 
remove it, but that would probably upset pgindent?


I recall that we used to have a bunch of similar problems with the AIX 
compilers a long time ago.  Does anyone recall the solution, and do we 
still care?  (Note that it's only a warning in this case.)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Gregory Stark
Alvaro Herrera <[EMAIL PROTECTED]> writes:

> Alvaro Herrera wrote:
>
>> The "other hint bits" are:
>> 
>> - LP_DEAD as used by the various callers of ItemIdMarkDead.
>> - PD_PAGE_FULL
>> - BTPageOpaque->btpo_flags and btpo_cycleid
>> 
>> All of them are changed with only SetBufferCommitInfoNeedsSave being
>> called afterwards.
>
> I think we could get away with WAL-logging LP_DEAD via ItemIdMarkDead
> similar to what is done to SetHintBits in the posted patch, and cope
> with the rest by marking the page with the invalid checksum; they are
> not so frequent anyway so the reliability loss is low.

If PD_PAGE_FULL is set and that causes the crc to be set to the invalid sum
will we ever get another chance to set it?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Please make sure your patches are on the wiki page

2008-10-30 Thread Robert Haas
>> (1) moving all of the patches committed prior to 11/1 to a separate
>> section or page
>
> Why?

To reduce clutter, but I don't feel strongly about it.

>> (2) sorting the pending patches by complexity or subject matter
>
> Sorting them by complexity would be great, if I thought I could do it.  I'm
> not sure I can.

I think the biggest patches for this commitfest are SEPostgresql,
Simon Riggs' work on hot standby (which is not on the commitfest page
yet and probably supersedes some of the earlier patches that are still
on there), window functions, DDL lock strength reduction (not sure how
big it is but I would guess it probably has to be reviewed by -core),
parallel restore, and maybe grouping sets (though it seems like that
is a ways from being committable).  I agree with Tom that it would
probably be good to try to get these (and any other big ones that I
may have missed) reviewed by core early and committed or feedback
given quickly.  It may require some of the other patches to be
resnapped to head but that is probably the lesser of two evils.

One other random note - I don't believe there has been a new version
of the column-level permissions patch that Stephen Frost was working
on since the last commitfest.  Unless someone disagrees with Markus
Wanner's conclusion that it wasn't ready for commit at that point,
this one can probably be bounced from the get-go.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] User defined I/O conversion casts

2008-10-30 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
Yeah, a magical OID clearly has some issues. A new field in pg_cast is 
the obvious alternative. How about adding a "castmethod" char field, 
with values:

b = binary-compatible cast (CREATE CAST ... WITHOUT FUNCTION)
i = I/O coercion cast (the new beast, CREATE CAST ... WITH INOUT)
f = use cast function specified in castfunc field.
castfunc is 0 for methods b and i.


Seems sane to me.  If you do that, please add a check in the opr_sanity
regression test that castfunc is nonzero if and only if castmethod is f.


Ok, will do.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] User defined I/O conversion casts

2008-10-30 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Yeah, a magical OID clearly has some issues. A new field in pg_cast is 
> the obvious alternative. How about adding a "castmethod" char field, 
> with values:
> b = binary-compatible cast (CREATE CAST ... WITHOUT FUNCTION)
> i = I/O coercion cast (the new beast, CREATE CAST ... WITH INOUT)
> f = use cast function specified in castfunc field.
> castfunc is 0 for methods b and i.

Seems sane to me.  If you do that, please add a check in the opr_sanity
regression test that castfunc is nonzero if and only if castmethod is f.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] User defined I/O conversion casts

2008-10-30 Thread Heikki Linnakangas

(Resurrecting an old thread.)

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:

Patch attached. I'm using a magic OID "1" in pg_cast.castfunc field to
mark these extra I/O conversion casts.


Ugh.  That's really unacceptable (doesn't it make the oidjoins
regression test fail?),


Yeah, a magical OID clearly has some issues. A new field in pg_cast is 
the obvious alternative. How about adding a "castmethod" char field, 
with values:

b = binary-compatible cast (CREATE CAST ... WITHOUT FUNCTION)
i = I/O coercion cast (the new beast, CREATE CAST ... WITH INOUT)
f = use cast function specified in castfunc field.

castfunc is 0 for methods b and i.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimizing COPY

2008-10-30 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> The basic idea is to replace the custom loop in CopyReadLineText with 
> memchr(), because memchr() is very fast. To make that possible, perform 
> the client-server encoding conversion on each raw block that we read in, 
> before splitting it into lines. That way CopyReadLineText only needs to 
> deal with server encodings, which is required for the memchr() to be safe.

Okay, so of course the trick with that is the block boundary handling.
The protocol says the client can break the data apart however it likes.
I see you've tried to deal with that, but this part seems wrong:

> ! if (convertable_bytes == 0)
> ! {
> ! /*
> !  * EOF, and there was some unconvertable chars 
> at the end.
> !  * Call pg_client_to_server on the remaining 
> bytes, to
> !  * let it throw an error.
> !  */
> ! cvt = pg_client_to_server(raw, inbytes);
> ! Assert(false); /* pg_client_to_server should've 
> errored */
> ! }

You're not (AFAICS) definitely at EOF here; you might just have gotten
a pathologically short message.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby: Caches and Locks

2008-10-30 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Thu, 2008-10-30 at 08:30 -0400, Tom Lane wrote:
>> What about using the existing
>> syscache logic to re-derive inval information from watching the update
>> operations?

> That does sound possible, but it makes some big assumptions about
> transactional machinery being in place. It ain't. Subtransactions make
> everything about 5 times more difficult, so it seems pretty scary to me.

Um.  Yeah, subtransactions would be a PITA.  Never mind that then ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Alvaro Herrera
Alvaro Herrera wrote:

> The "other hint bits" are:
> 
> - LP_DEAD as used by the various callers of ItemIdMarkDead.
> - PD_PAGE_FULL
> - BTPageOpaque->btpo_flags and btpo_cycleid
> 
> All of them are changed with only SetBufferCommitInfoNeedsSave being
> called afterwards.

I think we could get away with WAL-logging LP_DEAD via ItemIdMarkDead
similar to what is done to SetHintBits in the posted patch, and cope
with the rest by marking the page with the invalid checksum; they are
not so frequent anyway so the reliability loss is low.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Optimizing COPY

2008-10-30 Thread Heikki Linnakangas
Back in March, I played around with various hacks to improve COPY FROM 
performance: 
http://archives.postgresql.org/pgsql-patches/2008-03/msg00145.php


I got busy with other stuff, but I now got around to try what I planned 
back then. I don't know if I have the time to finish this for 8.4, but 
might as well post what I've got.


The basic idea is to replace the custom loop in CopyReadLineText with 
memchr(), because memchr() is very fast. To make that possible, perform 
the client-server encoding conversion on each raw block that we read in, 
before splitting it into lines. That way CopyReadLineText only needs to 
deal with server encodings, which is required for the memchr() to be safe.


Attached is a quick patch for that. Think of it as a prototype; I 
haven't tested it much, and I feel that it needs some further cleanup. 
Quick testing suggests that it gives 0-20% speedup, depending on the 
data. Very narrow tables don't benefit much, but the wider the table, 
the bigger the gain. I haven't found a case where it performs worse.


I'm only using memchr() with non-csv format at the moment. It could be 
used for CSV as well, but it's more complicated because in CSV mode we 
need to keep track of the escapes.


Some collateral damage:
\. no longer works. If we only accept \. on a new line, like we already 
do in CSV mode, it shouldn't be hard or expensive to make it work again. 
The manual already suggests that we only accept it on a single line: 
"End of data can be represented by a single line containing just 
backslash-period (\.)."


Escaping a linefeed or carriage return by prepending it with a backslash 
no longer works. You have to use \n and \r. The manual already warns 
against doing that, so I think we could easily drop support for it. If 
we wanted, it wouldn't be very hard to make it work, though: after 
hitting a newline, scan back and count how many backslashes there is 
before the newline. An odd number means that it's an escaped newline, 
even number (including 0) means it's a real newline.


For the sake of simplifying the code, would anyone care if we removed 
support for COPY with protocol version 2?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** src/backend/commands/copy.c
--- src/backend/commands/copy.c
***
*** 147,153  typedef struct CopyStateData
  	 * can display it in error messages if appropriate.
  	 */
  	StringInfoData line_buf;
- 	bool		line_buf_converted;		/* converted to server encoding? */
  
  	/*
  	 * Finally, raw_buf holds raw data read from the data source (file or
--- 147,152 
***
*** 160,165  typedef struct CopyStateData
--- 159,168 
  	char	   *raw_buf;
  	int			raw_buf_index;	/* next byte to process */
  	int			raw_buf_len;	/* total # of bytes stored */
+ 
+ #define MAX_CONVERSION_GROWTH 4 /* from mbutils.c */
+ 	char		unconverted_buf[MAX_CONVERSION_GROWTH];
+ 	int			unconverted_buf_len;	/* total # of bytes stored */
  } CopyStateData;
  
  typedef CopyStateData *CopyState;
***
*** 248,253  static void CopyOneRowTo(CopyState cstate, Oid tupleOid,
--- 251,257 
  static void CopyFrom(CopyState cstate);
  static bool CopyReadLine(CopyState cstate);
  static bool CopyReadLineText(CopyState cstate);
+ static bool CopyReadLineCSV(CopyState cstate);
  static int CopyReadAttributesText(CopyState cstate, int maxfields,
  	   char **fieldvals);
  static int CopyReadAttributesCSV(CopyState cstate, int maxfields,
***
*** 673,680  CopyLoadRawBuf(CopyState cstate)
  	else
  		nbytes = 0;/* no data need be saved */
  
! 	inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
! 		  1, RAW_BUF_SIZE - nbytes);
  	nbytes += inbytes;
  	cstate->raw_buf[nbytes] = '\0';
  	cstate->raw_buf_index = 0;
--- 677,773 
  	else
  		nbytes = 0;/* no data need be saved */
  
! 	if (cstate->need_transcoding)
! 	{
! 		char	   *cvt;
! 		int			convertable_bytes;
! 		char	   *raw;
! 
! 		/*
! 		 * Read data directly to raw_buf. That way, if pg_client_to_server
! 		 * doesn't need to do any conversion after all, we avoid one memcpy.
! 		 * Prepend any unconverted bytes from the last batch first.
! 		 */
! 		raw = cstate->raw_buf + nbytes;
! 
! 		memcpy(raw, cstate->unconverted_buf, cstate->unconverted_buf_len);
! 
! 		inbytes = CopyGetData(cstate, raw + cstate->unconverted_buf_len, 1,
! 			  (RAW_BUF_SIZE - nbytes - cstate->unconverted_buf_len) / MAX_CONVERSION_GROWTH);
! 		inbytes += cstate->unconverted_buf_len;
! 
! 		if (inbytes > 0)
! 		{
! 			/*
! 			 * Determine the number of bytes that can be converted in this
! 			 * batch. 
! 			 */
! 			if (cstate->client_encoding == PG_UTF8)
! 			{
! /*
!  * Start from the end, and backtrack to the beginning of
!  * the last character. XXX: this should be generalized and
!  * pushed to wchar.c. I believe we could do the same for
!  * many other encodings too.
!  */
! convertable_bytes = inbytes;
! 		

Re: [HACKERS] Hot Standby: Caches and Locks

2008-10-30 Thread Simon Riggs

On Thu, 2008-10-30 at 08:30 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> >> We can't augment the commit/abort messages because
> >> we must cater for non-transactional invalidations also, plus commit
> >> xlrecs are already complex enough. So we log invalidations prior to
> >> commit, queue them and then trigger the send at commit (if it
> >> happens).
> 
> > Augmenting the commit messages seems like the better approach. It allows
> > invalidation messages to be fired as they are read off the xlrec. Still
> > need the additional message type to handle nontransactional
> > invalidation. There are other messages possibly more complex than this
> > already.
> 
> I guess I hadn't been paying attention, but: adding syscache inval
> traffic to WAL seems like a completely horrid idea, both from the
> complexity and performance standpoints.  

Well, it's coming out fairly simple actually. Can you explain where you
think the performance loss is? My expectation is less than a 0.1% WAL
volume overhead for typical systems. My comment this morning was to say
I've managed to augment the commit record, so we're not even sending
many additional messages.

It also makes much of the Hot Standby patch fairly simple, even if it is
large. Write something to WAL, act on it on the other side. I've paid
very close attention to minimising the effects on both the number of
lock acquisitions and total WAL volume, but having said that I expect
there to be many tuning opportunities.

> What about using the existing
> syscache logic to re-derive inval information from watching the update
> operations?

That does sound possible, but it makes some big assumptions about
transactional machinery being in place. It ain't. Subtransactions make
everything about 5 times more difficult, so it seems pretty scary to me.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question about GetAttributeByNum(Name) ExecQual.c

2008-10-30 Thread Zdenek Kotala

Tom Lane napsal(a):

Zdenek Kotala <[EMAIL PROTECTED]> writes:
GetAttributeByNum has first parameter as HeapTupleHeader,  but most functions 
use Datum and after that they call DatumGetHeapTupleHeader. The difference is 
that DatumGetHeapTupleHeader performs detoast on tuple(row type).



Is it intention do not detoast in these functions or it is a bug?


You would certainly not want a tuple to get separately detoasted for
each attribute you pull from it.  So having detoasting here would be
the wrong thing IMHO.


Does it mean that these function are every time called with heap tuple or 
untoasted row type (composite data type)? What is purpose of these function. 
They are not use in executor and never in the core (only in example and 
regress). Should be really in executor?


Thanks Zdenek



--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question about GetAttributeByNum(Name) ExecQual.c

2008-10-30 Thread Tom Lane
Zdenek Kotala <[EMAIL PROTECTED]> writes:
> GetAttributeByNum has first parameter as HeapTupleHeader,  but most functions 
> use Datum and after that they call DatumGetHeapTupleHeader. The difference is 
> that DatumGetHeapTupleHeader performs detoast on tuple(row type).

> Is it intention do not detoast in these functions or it is a bug?

You would certainly not want a tuple to get separately detoasted for
each attribute you pull from it.  So having detoasting here would be
the wrong thing IMHO.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby: Caches and Locks

2008-10-30 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
>> We can't augment the commit/abort messages because
>> we must cater for non-transactional invalidations also, plus commit
>> xlrecs are already complex enough. So we log invalidations prior to
>> commit, queue them and then trigger the send at commit (if it
>> happens).

> Augmenting the commit messages seems like the better approach. It allows
> invalidation messages to be fired as they are read off the xlrec. Still
> need the additional message type to handle nontransactional
> invalidation. There are other messages possibly more complex than this
> already.

I guess I hadn't been paying attention, but: adding syscache inval
traffic to WAL seems like a completely horrid idea, both from the
complexity and performance standpoints.  What about using the existing
syscache logic to re-derive inval information from watching the update
operations?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLE command

2008-10-30 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> If I read this right, SQL would allow writing
> TABLE foo;

> as a top-level command, equivalent to SELECT * FROM foo; (see production 
> ).  It can be used whereever a SELECT or VALUES can be used.

> This is probably about as useless as some of my other recent patches, 
> but the implementation is simple (see attachment), so we could add it. 
> Comments?

Considering how ugly it was to fit top-level VALUES into our
documentation, I'm not really looking forward to this one.  I'd vote no,
at least until such time as we see some field demand for it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pre-MED

2008-10-30 Thread Tom Lane
David Fetter <[EMAIL PROTECTED]> writes:
> On Wed, Oct 29, 2008 at 10:23:36PM -0400, Tom Lane wrote:
>> I would argue that it's already designed wrong if there's need for
>> PL-specific implementation effort.

> I'm not sure how else to do this.  The current implementation returns
> char *, which doesn't translate uniformly into the PLs.

Surely they all have a way to call a SQL function that returns text.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements

2008-10-30 Thread ITAGAKI Takahiro

Martin Pihlak <[EMAIL PROTECTED]> wrote:

> Attached is a patch that adds sourceText to QueryDesc.

It worked fine surprisingly :) . Internal and C functions don't use
executor, so we can ignore trivial function calls (ex. operators).
Furthermore, it is ok if QueryDesc doesn't have queryText
because the result is counted up in the upper statement.

> But is the idea of extending QueryDesc generally acceptable? Is it OK
> to make a copy of the query string?

The only thing I'm worried about is that QueryDesc lives longer than
its queryText. Can I assume it never occurs?

> I tested with modified pg_stat_statements (removed toplevel checks),

Stack of counters would be better. The attached is modified to do so.
It might be worth thinking about adding counters that are equivalent
to total_time and self_time in in pg_stat_user_functions.

=# CREATE OR REPLACE FUNCTION plfn(integer) RETURNS bigint AS
$$
DECLARE
  i bigint;
BEGIN
  SELECT count(*) INTO i FROM pg_class;
  SELECT count(*) INTO i FROM pg_class;
  SELECT count(*) INTO i FROM generate_series(1, $1);
  RETURN i;
END;
$$
LANGUAGE plpgsql;

=# SELECT sum(plfn(1)) FROM generate_series(1, 100);
=# SELECT query, calls, total_time, rows FROM pg_stat_statements;

 query | calls | total_time | 
rows
---+---++--
 SELECT sum(plfn(1)) FROM generate_series(1, 100); | 1 |428 |   
 1
 SELECT count(*) FROM pg_class |   200 | 32 |  
200
 SELECT count(*) FROM generate_series(1,  $1 ) |   100 |309 |  
100
(3 rows)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



pg_stat_statements.c
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Updating FSM on recovery

2008-10-30 Thread Gregory Stark

Heikki Linnakangas <[EMAIL PROTECTED]> writes:

> Hmm. I think an enum is better than a bitmask here. At the moment, we need
> three different modes of operation:
> 1. Read the page as usual, throw an error on corrupted page (ReadBuffer())
> 2. Read the page, zero page on corruption (this is new)

Is this new? Would it make sense for zero_damaged_pages to use this? Perhaps
the enum should have an option to error on damaged pages, warn and zero
damaged pages, or just zero damaged pages.

We might also want different behaviour for pages for which the crc doesn't
match versus pages that have nonsensical page headers.

> 3. Don't read the page from disk, just allocate a buffer. (ReadOrZeroBuffer())

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Question about GetAttributeByNum(Name) ExecQual.c

2008-10-30 Thread Zdenek Kotala
GetAttributeByNum has first parameter as HeapTupleHeader,  but most functions 
use Datum and after that they call DatumGetHeapTupleHeader. The difference is 
that DatumGetHeapTupleHeader performs detoast on tuple(row type).


Is it intention do not detoast in these functions or it is a bug?

thanks Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-30 Thread Hans-Jürgen Schönig

Bruce Momjian wrote:

Greg Stark wrote:
  
I couldn't get async I/O to work on Linux. That is it "worked" but  
performed the same as reading one block at a time. On solaris the  
situation is reversed.


In what way is fadvise a kludge?



I think he is saying AIO gives us more flexibility, but I am unsure we
need it.
  



absolutely.
posix_fadvise is easy to implement and i would assume that it takes away 
a lot of "guessing" on the OS internals side.
the database usually knows that it is gonna read a lot of data in a 
certain way and it cannot be a bad idea to give the kernel a hint here.
especially synchronized seq scans and so on are real winners here as you 
stop confusing the kernel with XX concurrent readers on the same file.

this can also be an issue with some controller firmwares and so on.

   many thanks,

  hans


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby: Caches and Locks

2008-10-30 Thread Simon Riggs

On Tue, 2008-10-21 at 15:06 +0100, Simon Riggs wrote:

> We can't augment the commit/abort messages because
> we must cater for non-transactional invalidations also, plus commit
> xlrecs are already complex enough. So we log invalidations prior to
> commit, queue them and then trigger the send at commit (if it
> happens).

Augmenting the commit messages seems like the better approach. It allows
invalidation messages to be fired as they are read off the xlrec. Still
need the additional message type to handle nontransactional
invalidation. There are other messages possibly more complex than this
already.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Updating FSM on recovery

2008-10-30 Thread Simon Riggs

On Thu, 2008-10-30 at 09:57 +, Simon Riggs wrote:
> On Thu, 2008-10-30 at 10:40 +0200, Heikki Linnakangas wrote:
> 
> > So, attached is a patch using an enum. Barring objections, I'll commit
> > this.
> 
> I probably agree with the changes from reading your post, but I'd ask
> that you hang fire on committing this for a few days.

Best thing from here is for me to just freeze my tree for next few days.
It will make my submission a few days out of date, but we can fix that
up fairly quickly.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Updating FSM on recovery

2008-10-30 Thread Simon Riggs

On Thu, 2008-10-30 at 10:40 +0200, Heikki Linnakangas wrote:

> So, attached is a patch using an enum. Barring objections, I'll commit
> this.

I probably agree with the changes from reading your post, but I'd ask
that you hang fire on committing this for a few days.

It's just going to prevent Koichi and myself from submitting clean
patches on F-Day, or it will cause us to spend time on rework before
we've even submitted the patch. I'd like to avoid the pileup for now,
though don't have any problem with the rework after that point.

Thanks,

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Updating FSM on recovery

2008-10-30 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
The ReadBuffer() interface is already pretty complex, with all the 
different variants. We should probably keep the good old ReadBuffer() 
the same, for the sake of simplicity in the callers, but try to reduce 
the number of other variatns.


Indeed.  Did you see the discussion about the similarly-too-complex
heap_insert API a couple days ago in connection with bulk-write
scenarios?  The conclusion there was to try to shift stuff into a
bitmask options argument, in hopes that future additions might not
require touching every caller.  Can we do it similarly here?


Hmm. I think an enum is better than a bitmask here. At the moment, we 
need three different modes of operation:

1. Read the page as usual, throw an error on corrupted page (ReadBuffer())
2. Read the page, zero page on corruption (this is new)
3. Don't read the page from disk, just allocate a buffer. 
(ReadOrZeroBuffer())


If we turned this into a bitmask, what would the bits be? Perhaps:

DONT_READ /* don't read the page from disk, just allocate buffer */
NO_ERROR_ON_CORRUPTION /* don't throw an error if page is corrupt */

With two bits, there's four different combinations. I don't think the 
DONT_READ | NO_ERROR_ON_CORRUPTION combination makes much sense. Also, 
negative arguments like that can be confusing, but if we inverted the 
meanings, most callers would have to pass both flags to get the normal 
behavior.


Looking into the crystal ball, there's two forthcoming features to the 
interface that I can see:
1. Pin the buffer if the page is in buffer cache. If it's not, do 
nothing. This is what Simon proposed for the B-tree vacuum interlocking, 
and I can see that it might be useful elsewhere as well.
2. The posix_fadvise() thing. Or async I/O. It looks like it's going to 
be a separate function you call before ReadBuffer(), but it could also 
be implemented as a new mode to ReadBuffer() that just allocates a 
buffer, issues a posix_fadvise(), and returns. You would then pass the 
Buffer to another function to finish the read and make the contents of 
the buffer valid.


Neither of these fits too well with the bitmask. Neither would make 
sense with DONT_READ or NO_ERROR_ON_CORRUPTION.


So, attached is a patch using an enum. Barring objections, I'll commit this.

There is a conflict with Simon's hot standby patch, though. Simon's 
patch adds yet another argument to XLogReadBufferWithFork(), to indicate 
whether a normal exclusive lock or a cleanup lock is taken on the 
buffer. I'm inclined to change the interface of XLogReadBufferExtended 
(as it's now called, after this patch) so that it only pins the page, 
and leave the locking to the caller.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** src/backend/access/gin/ginvacuum.c
--- src/backend/access/gin/ginvacuum.c
***
*** 155,164  xlogVacuumPage(Relation index, Buffer buffer)
  static bool
  ginVacuumPostingTreeLeaves(GinVacuumState *gvs, BlockNumber blkno, bool isRoot, Buffer *rootBuffer)
  {
! 	Buffer		buffer = ReadBufferWithStrategy(gvs->index, blkno, gvs->strategy);
! 	Page		page = BufferGetPage(buffer);
  	bool		hasVoidPage = FALSE;
  
  	/*
  	 * We should be sure that we don't concurrent with inserts, insert process
  	 * never release root page until end (but it can unlock it and lock
--- 155,168 
  static bool
  ginVacuumPostingTreeLeaves(GinVacuumState *gvs, BlockNumber blkno, bool isRoot, Buffer *rootBuffer)
  {
! 	Buffer		buffer;
! 	Page		page;
  	bool		hasVoidPage = FALSE;
  
+ 	buffer = ReadBufferExtended(gvs->index, MAIN_FORKNUM, blkno,
+ RBM_NORMAL, gvs->strategy);
+ 	page = BufferGetPage(buffer);
+ 
  	/*
  	 * We should be sure that we don't concurrent with inserts, insert process
  	 * never release root page until end (but it can unlock it and lock
***
*** 241,253  static void
  ginDeletePage(GinVacuumState *gvs, BlockNumber deleteBlkno, BlockNumber leftBlkno,
  			  BlockNumber parentBlkno, OffsetNumber myoff, bool isParentRoot)
  {
! 	Buffer		dBuffer = ReadBufferWithStrategy(gvs->index, deleteBlkno, gvs->strategy);
! 	Buffer		lBuffer = (leftBlkno == InvalidBlockNumber) ?
! 	InvalidBuffer : ReadBufferWithStrategy(gvs->index, leftBlkno, gvs->strategy);
! 	Buffer		pBuffer = ReadBufferWithStrategy(gvs->index, parentBlkno, gvs->strategy);
  	Page		page,
  parentPage;
  
  	LockBuffer(dBuffer, GIN_EXCLUSIVE);
  	if (!isParentRoot)			/* parent is already locked by
   * LockBufferForCleanup() */
--- 245,268 
  ginDeletePage(GinVacuumState *gvs, BlockNumber deleteBlkno, BlockNumber leftBlkno,
  			  BlockNumber parentBlkno, OffsetNumber myoff, bool isParentRoot)
  {
! 	Buffer		dBuffer;
! 	Buffer		lBuffer;
! 	Buffer		pBuffer;
  	Page		page,
  parentPage;
  
+ 	dBuffer = ReadBufferExtended(gvs->index, MAIN_FORKNUM, deleteBlkno,
+  RBM_NORMAL, gvs->strategy);
+ 
+ 	if (leftBlkno != InvalidBlockNumber)
+ 		lBuffer = R

Re: [HACKERS] SQL/MED compatible connection manager

2008-10-30 Thread Martin Pihlak
> I have put together a draft that describes a possible implementation:
> http://wiki.postgresql.org/wiki/SqlMedConnectionManager
> 

I'll update this with an experimental patch. This implements:
* System catalogs for FDW, SERVER and USER MAPPING
* regserver data type for servers (convert from text to oid etc).
* FDW placeholders as shared libraries -- currently provides dummy
  and pgsql wrappers.
* Connection lookup via FDW.
* SQL interface to the lookup function.

There is an example of how all this works at:
http://wiki.postgresql.org/wiki/SqlMedConnectionManager#Current_implementation

I was hoping to get the internal features done by the start of November
Commitfest. But right now this seems unlikely. Nevertheless, I'm all ears
for suggestions and criticism.

PS. Jonah, does this somehow coincide with your approach to FDW-s?

regards,
Martin



connection-manager.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] TABLE command

2008-10-30 Thread Peter Eisentraut

If I read this right, SQL would allow writing

TABLE foo;

as a top-level command, equivalent to SELECT * FROM foo; (see production 
).  It can be used whereever a SELECT or VALUES can be used.


This is probably about as useless as some of my other recent patches, 
but the implementation is simple (see attachment), so we could add it. 
Comments?
diff -ur ../cvs-pgsql/src/backend/parser/gram.y ./src/backend/parser/gram.y
--- ../cvs-pgsql/src/backend/parser/gram.y  2008-10-29 13:37:47.0 
+0200
+++ ./src/backend/parser/gram.y 2008-10-29 16:49:09.0 +0200
@@ -6416,6 +6416,25 @@
$$ = (Node *)n;
}
| values_clause 
{ $$ = $1; }
+   | TABLE table_ref
+   {
+   /* same as SELECT * FROM table_ref */
+   ColumnRef *cr = makeNode(ColumnRef);
+   ResTarget *rt = makeNode(ResTarget);
+   SelectStmt *n = makeNode(SelectStmt);
+
+   cr->fields = 
list_make1(makeNode(A_Star));
+   cr->location = -1;
+
+   rt->name = NULL;
+   rt->indirection = NIL;
+   rt->val = (Node *)cr;
+   rt->location = -1;
+
+   n->targetList = list_make1(rt);
+   n->fromClause = list_make1($2);
+   $$ = (Node *)n;
+   }
| select_clause UNION opt_all select_clause
{
$$ = makeSetOp(SETOP_UNION, $3, $1, $4);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers