[HACKERS] notify with payload (pgkill, notify)

2008-04-01 Thread James Mansion

Is the intent to replace most uses of (pg)kill with a general-purpose
messaging system between the processes, or
(just) to address notify per se?

(Presumably with 'fire-and-forget' and also rpc
semantics?  And pub-sub? And some sort of
write to an fd protected by an atomic flag to
elide multiple writes when the process hasn't woken
and acknowledged the ATTN yet?)
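The write-to-an-fd-with-a-flag idea can be sketched as follows (illustrative Python, not PostgreSQL code; the check-then-set here is not truly atomic, so a real implementation would need an atomic test-and-set):

```python
# Sketch: coalescing wakeups so that many notifications sent before the
# receiver wakes produce only one write to the wakeup fd. All names here
# are illustrative, not real backend APIs.
import os
import threading

class CoalescingWakeup:
    def __init__(self):
        self.read_fd, self.write_fd = os.pipe()
        self._armed = threading.Event()   # "ATTN already pending" flag

    def notify(self):
        # Only the first notifier writes; later ones are elided until
        # the receiver acknowledges by draining the pipe.
        # (A real implementation would use an atomic test-and-set here.)
        if not self._armed.is_set():
            self._armed.set()
            os.write(self.write_fd, b"x")

    def wait_and_ack(self):
        os.read(self.read_fd, 1)          # blocks until someone notified
        self._armed.clear()               # re-arm for the next burst

w = CoalescingWakeup()
for _ in range(100):                      # a burst of notifications...
    w.notify()                            # ...causes exactly one write
w.wait_and_ack()
```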

If pgkill is not used for signalling, could this reduce the reliance on
signals (except for trying to kill off processes) to the point
where it becomes much less scary to link to libraries
that use signals themselves and/or threaded runtimes?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-01 Thread Bruce Momjian

There isn't enough agreement to move some things from pgcrypto to the
core so this thread is being removed from the patch queue.

---

Bruce Momjian wrote:
> 
> I am not thrilled about moving _some_ of pgcrypto into the backend ---
> pgcrypto right now seems well designed and if we pull part of it out it
> seems it will be less clear than what we have now.  Perhaps we just need
> to document that md5() isn't for general use and some function in
> pgcrypto should be used instead?
> 
> ---
> 
> Marko Kreen wrote:
> > On 1/21/08, Tom Lane <[EMAIL PROTECTED]> wrote:
> > > > MD5 is broken in the sense that you can create two or more meaningful
> > > > documents with the same hash.
> > >
> > > Note that this isn't actually very interesting for the purpose for
> > > which the md5() function was put into core: namely, hashing passwords
> > > before they are stored in pg_authid.
> > 
> > Note: this was a bad idea.  The function that should have been
> > added to core would be pg_password_hash(username, password).
> > 
> > Adding md5() lessens the incentive to install pgcrypto or push/accept
> > digest() into core, and gives the impression there will be sha1(), etc.,
> > in the future.
> > 
> > Now users who want to store passwords in the database (the most
> > popular usage) will probably go with md5() without bothering
> > with pgcrypto.  They probably see "Postgres itself uses MD5 too"
> > without realizing their situation is totally different from
> > the pg_authid one.
> > 
> > It's like having a solution that is ACID-compliant 99% of the time in
> > core, so why bother with the 100% one.
> > 
> > -- 
> > marko
> > 
> > ---(end of broadcast)---
> > TIP 4: Have you searched our list archives?
> > 
> >http://archives.postgresql.org
> 
> -- 
>   Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
>   EnterpriseDB http://postgres.enterprisedb.com
> 
>   + If your life is a hard drive, Christ can be your backup. +
> 
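The contrast Marko draws above can be illustrated with a short sketch (Python's hashlib stands in for the backend's MD5; pg_password_hash is the hypothetical function named in the quoted mail):

```python
# pg_authid stores "md5" || md5(password || username) -- the username
# acts as a per-user salt; a bare md5(password) call has no salt at all.
import hashlib

def pg_password_hash(username: str, password: str) -> str:
    """What pg_authid stores (the function Marko wishes were in core)."""
    return "md5" + hashlib.md5((password + username).encode()).hexdigest()

def bare_md5(password: str) -> str:
    """What an application gets from the SQL md5() function -- unsalted."""
    return hashlib.md5(password.encode()).hexdigest()

# Two users with the same password get identical bare hashes (handy for
# rainbow-table attacks) but distinct pg_authid-style hashes.
print(bare_md5("secret") == bare_md5("secret"))            # True
print(pg_password_hash("alice", "secret") ==
      pg_password_hash("bob", "secret"))                   # False
```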

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] build multiple indexes in single table pass?

2008-04-01 Thread Greg Smith

On Tue, 1 Apr 2008, Andrew Dunstan wrote:

I don't know if this has come up before exactly, but is it possible that we 
could get a performance gain from building multiple indexes from a single 
sequential pass over the base table?


It pops up regularly; you might even have walked by a discussion of this 
idea with myself, Jan, and Jignesh over the weekend.  Jignesh pointed out 
that index creation was a major drag on his PostgreSQL benchmarking 
operations, and I've run into that myself.  I have a large dataset where 
creating a simple index takes around 70% of the time it takes to load the 
data in the first place; his multiple-index tables took multiples of load 
time to index.  Considering that the bulk load isn't exactly speedy either, 
this gives you an idea how much room for improvement there is.


The idea we were bouncing around went a step past that and considered 
this:  if you have good statistics on a table, and you have a sample set 
of queries you want to execute against it, how would you use that 
information to plan what indexes should be created?  Needing to be able to 
create multiple indexes at once efficiently was an implementation detail 
to pull that off.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



[HACKERS] variadic patch

2008-04-01 Thread Bruce Momjian

Because of this:

> variadic function, named params exist only as WIP and I see it for the
> next commit fest. I'll send a new version in the coming months.

This has been saved for the next commit-fest:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Pavel Stehule wrote:
> Hello
> 
> there is some noise about my patches :(
> 
> I sent EXECUTE USING - it's important (it guards against SQL injection
> and enables faster dynamic SQL), and this patch has lingered in the queue.
> 
> 
> Regards
> Pavel Stehule
> 
> 
> 
> On 25/03/2008, Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> > Ok, AFAICT it is complete:
> >
> >  http://wiki.postgresql.org/wiki/CommitFest:March
> >
> >  It is a reasonably short page, so it's really easy to search for things
> >  you might want to work on for this commit fest.
> >
> >  I also added the patches submitted on March 2008 to the May commitfest
> >  page.
> >
> >  Patch submitters: please have a look at the current commitfest page and
> >  check for possible nuisances.
> >
> >  --
> >  Alvaro Herrerahttp://www.CommandPrompt.com/
> >  The PostgreSQL Company - Command Prompt, Inc.
> >
> >
> >
> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



[HACKERS] Update to patch queue web page

2008-04-01 Thread Bruce Momjian
I have updated the patch queue web page so that I can move related
threads into a single thread, and have done so for the remaining emails.

I have also improved the appearance using CSS markup.

The new "Append to Message-ID" feature is currently only available to me,
but I can give others access as needed.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] column level privileges

2008-04-01 Thread Andrew Dunstan



The earliest will be 8.4, which is many, many months away.

It should be possible to produce a patch for 8.3 if you're interested.

cheers

andrew

sanjay sharma wrote:

Hello Andrew,
 
When do you expect this patch to go into production and be available for 
public use? I will keep an eye out for its release.
 
Sanjay Sharma


> Date: Tue, 1 Apr 2008 18:40:24 -0400
> From: [EMAIL PROTECTED]
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] column level privileges
>
>
> Apologies if this gets duplicated - original seems to have been dropped
> due to patch size - this time I am sending it gzipped.
>
> cheers
>
> andrew
>
>  Original Message 
> Subject: column level privileges
> Date: Tue, 01 Apr 2008 08:32:25 -0400
> From: Andrew Dunstan <[EMAIL PROTECTED]>
> To: Patches (PostgreSQL) <[EMAIL PROTECTED]>
>
>
>
> This patch by Golden Lui was his work for the last Google SoC. I was his
> mentor for the project. I have just realised that he didn't send his
> final patch to the list.
>
> I guess it's too late for the current commit-fest, but it really needs
> to go on a patch queue (my memory on this was jogged by Tom's recent
> mention of $Subject).
>
> I'm going to see how much bitrot there is and see what changes are
> necessary to get it to apply.
>
> cheers
>
> andrew
>
>
> -
> Here is a README for the whole patch.
>
> According to the SQL92 standard, there are four levels in the privilege
> hierarchy, i.e. database, tablespace, table, and column. Most commercial
> DBMSs support all the levels, but column-level privilege is hitherto
> unaddressed in PostgreSQL, and this patch tries to implement it.
>
> What this patch has done:
> 1. The execution of GRANT/REVOKE for column privileges. Now only
> INSERT/UPDATE/REFERENCES privileges are supported, as SQL92 specified.
> SELECT privilege is not yet supported. This part includes:
> 1.1 Add a column named 'attrel' in pg_attribute catalog to store
> column privileges. Now all column privileges are stored, no matter
> whether they could be implied from table-level privilege.
> 1.2 Parser for the new kind of GRANT/REVOKE commands.
> 1.3 Execution of GRANT/REVOKE for column privileges. Corresponding
> column privileges will be added/removed automatically if no column is
> specified, as SQL standard specified.
> 2. Column-level privilege check.
> Now for UPDATE/INSERT/REFERENCES privilege, privilege check will be
> done ONLY on column level. Table-level privilege check was done in the
> function InitPlan. Now in this patch, these three kinds of privilege are
> checked during the parse phase.
> 2.1 For UPDATE/INSERT commands. Privilege check is done in the
> function transformUpdateStmt/transformInsertStmt.
> 2.2 For REFERENCES, privilege check is done in the function
> ATAddForeignKeyConstraint. This function will be called whenever a
> foreign key constraint is added, like create table, alter table, etc.
> 2.3 For the COPY command, INSERT privilege is checked in the function
> DoCopy; SELECT privilege is checked in DoCopy too.
> 3. While adding a new column to a table using ALTER TABLE command, set
> appropriate privilege for the new column according to privilege already
> granted on the table.
> 4. Allow pg_dump and pg_dumpall to dump in/out column privileges.
> 5. Add a column named objsubid in pg_shdepend catalog to record ACL
> dependencies between column and roles.
> 6. Modify the grammar of ECPG to support column-level privileges.
> 7. Change psql's \z (\dp) command to support listing column privileges
> for tables and views. If \z (\dp) is run with a pattern, column
> privileges are listed after table-level privileges.
> 8. Regression tests for column-level privileges. I changed both
> privileges.sql and expected/privileges.out, so the regression check now
> passes in full.
>
> Best wishes
> Dong
> --
> Guodong Liu
> Database Lab, School of EECS, Peking University
> Room 314, Building 42, Peking University, Beijing, 100871, China
>
>
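As a toy model of the checking rules in the README above (illustrative Python, not the patch's C code): a table-level grant fans out to per-column entries, and checks are then purely per column:

```python
# Toy model of the described semantics: INSERT/UPDATE/REFERENCES are
# checked per column, and a GRANT with no column list applies to every
# column, as the SQL standard specifies. Names here are hypothetical.
class Table:
    def __init__(self, columns):
        self.columns = set(columns)
        self.acl = {c: set() for c in columns}   # column -> granted privs

    def grant(self, priv, columns=None):
        # No column list => apply the privilege to all columns.
        for c in (columns or self.columns):
            self.acl[c].add(priv)

    def check(self, priv, columns):
        # Check succeeds only if every referenced column carries the priv.
        return all(priv in self.acl[c] for c in columns)

t = Table(["id", "name", "salary"])
t.grant("UPDATE", ["name"])       # column-level grant
t.grant("INSERT")                 # table-level grant fans out to columns
```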







Re: [HACKERS] column level privileges

2008-04-01 Thread sanjay sharma

Hello Andrew,
 
When do you expect this patch to go into production and be available for
public use? I will keep an eye out for its release.
 
Sanjay Sharma
_
Technology : Catch up on updates on the latest Gadgets, Reviews, Gaming and 
Tips to use technology etc.
http://computing.in.msn.com/

Re: [HACKERS] build multiple indexes in single table pass?

2008-04-01 Thread ITAGAKI Takahiro

Toru SHIMOGAKI <[EMAIL PROTECTED]> wrote:

> Andrew Dunstan wrote:
> > we could get a performance gain from building multiple indexes from a 
> > single sequential pass over the base table?
> 
> It is already implemented in pg_bulkload 
> (http://pgbulkload.projects.postgresql.org/).

I think there are two ways to implement multiple index creation.
  1. Add multiple indexes AFTER data loading.
  2. Define multiple indexes BEFORE data loading.

pg_bulkload uses the 2nd way, but the TODO item seems to target
the 1st, right? -- Both are useful, though.

| Allow multiple indexes to be created concurrently, ideally via a
| single heap scan, and have pg_restore use it

In either case, we probably need to rework the ambuild interface.
I'm thinking of reversing the control of heap sequential scans:
the seq scan is done inside ambuild for now, but in the new method
it would be driven by an external loop.

Define a new IndexBuilder interface, something like:
interface IndexBuilder
{
   addTuple(IndexTuple tuple);
   finishBuild();
}
and make ambuild() return an IndexBuilder instance implemented by each AM.
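Driven from an external loop, a single heap scan feeding several such builders might look like this (illustrative Python with hypothetical names mirroring the pseudocode, not the real ambuild API):

```python
# Sketch of the inverted control flow: one sequential pass over the heap
# feeds every index builder, and each builder finalizes at the end
# (sorting once, btree-style). All names here are illustrative.
from typing import Callable, List

class IndexBuilder:
    """Per-AM builder: receives tuples one by one, finalizes at the end."""
    def __init__(self, key: Callable):
        self.key = key
        self._entries = []

    def add_tuple(self, tid: int, row: tuple):
        self._entries.append((self.key(row), tid))

    def finish_build(self):
        self._entries.sort()              # sort once after the scan
        return self._entries

def build_indexes(heap: List[tuple], builders: List[IndexBuilder]):
    # Single sequential pass over the base table...
    for tid, row in enumerate(heap):
        for b in builders:                # ...feeds every builder at once
            b.add_tuple(tid, row)
    return [b.finish_build() for b in builders]

heap = [("bob", 31), ("alice", 42), ("carol", 17)]
by_name, by_age = build_indexes(
    heap, [IndexBuilder(lambda r: r[0]), IndexBuilder(lambda r: r[1])])
```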

However, it cannot use multiple CPUs if indexes are built in one process.
A less granular method might be better for Postgres, like synchronized scans,
as already pointed out.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [JDBC] Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread Guillaume Smet
On Wed, Apr 2, 2008 at 2:53 AM, Dave Cramer <[EMAIL PROTECTED]> wrote:
>  I for one would be very interested in the JDBCBench code.

OK, I didn't make anything fancy, I just fixed the problem I
encountered when profiling Sequoia (I mostly used it as an injector).

I'll post the code tomorrow if I can find it somewhere (I lost a
couple of disks and I don't remember the box I used to run the tests).

-- 
Guillaume



Re: [HACKERS] New boxes available for QA

2008-04-01 Thread Guillaume Smet
On Tue, Apr 1, 2008 at 3:29 PM, Stephen Frost <[EMAIL PROTECTED]> wrote:
>  I'm almost done scripting up everything to load the TIGER/Line
>  Shapefiles from the US Census into PostgreSQL/PostGIS.  Once it's done
>  and working I would be happy to provide it to whomever asks, and it
>  might be an interesting data set to load/query and look at benchmarks
>  with.  There's alot of GIST index creation, as well as other indexes
>  like soundex(), and I'm planning to use partitioning of some sort for
>  the geocoder.  We could, for example, come up with some set of arbitrary
>  addresses to geocode and see what the performance of that is.
>
>  It's just a thought, and it's a large/"real" data set to play with.

I must admit that the first step I want to achieve is to have the
simplest regression tests running on a daily basis. A real database
with advanced features can be very interesting for the future.

I'm not sure loading the full database will provide useful results on
this hardware but we can always work on a subset of it.

-- 
Guillaume



Re: [HACKERS] New boxes available for QA

2008-04-01 Thread Guillaume Smet
On Wed, Apr 2, 2008 at 1:53 AM, Greg Smith <[EMAIL PROTECTED]> wrote:
>  The pgbench-tools utilities I was working on at one point anticipated this
>  sort of test starting one day.  You can't really get useful results out of
>  pgbench without running it enough times that you get average or median
>  values.  I dump everything into a results database which can be separated
>  from the databases used for running the test, and then it's easy to
>  compare day to day aggregate results across different query types.

I have already used your pgbench-tools, but I only used the ability to
draw graphs with gnuplot; I didn't test the database part.

>  I haven't had a reason to work on that recently, but if you've got a
>  semi-public box ready for benchmarks now I do.  Won't be able to run any
>  serious benchmarks on the systems you described, but should be great for
>  detecting basic regressions and testing less popular compile-time options
>  as you describe.

Yeah, that's exactly what they are for.

>  As far as the other more powerful machines you mentioned go, would need to
>  know a bit more about the disks and disk controller in there to comment
>  about whether those are worth the trouble to integrate.  The big missing
>  piece of community hardware that remains elusive would be a system with
>  >=4 cores, >=8GB RAM, and >=8 disks with a usable write-caching controller
>  in it.

All the other boxes are Dell boxes (1750/1850/2950/6850) with PERC 4
or 5 depending on the servers. Two of them have external attachments
to a disk array, but it's an old one with 2 separate arrays (4 disks +
5 disks IIRC).
They aren't big beasts, but I think they can be useful to hackers who
don't have any hardware fully available, and they can also run more
serious continuous tests than the other boxes.

I'll post the specs of the servers that may be fully available for
community purposes tomorrow.

-- 
Guillaume



Re: [JDBC] Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread Dave Cramer

Guillaume,

I for one would be very interested in the JDBCBench code.

Dave
On 1-Apr-08, at 8:35 PM, Guillaume Smet wrote:

On Wed, Apr 2, 2008 at 2:05 AM, Greg Smith <[EMAIL PROTECTED]>  
wrote:

I'm not sure if all of those changes are net positive for PostgreSQL
though, they weren't last time I played with this.


I fixed most of the bugs of JDBCBench I found when I benchmarked
Sequoia a long time ago. Totally forgot about it. I'll see if I can
find the sources somewhere.

--
Guillaume






Re: [HACKERS] Several tags around PostgreSQL 7.1 broken

2008-04-01 Thread Marc G. Fournier


--On Tuesday, April 01, 2008 14:06:09 -0400 Tom Lane <[EMAIL PROTECTED]> 
wrote:

> Peter Eisentraut <[EMAIL PROTECTED]> writes:
>> In the meantime, does anyone have more information about how this came about?
>
> Marc's always done both the tagging and the tarball-making, so you'd
> have to ask him about that.  I believe he's made it more scripted over
> the years, so this might reflect a manual foulup that (hopefully) is no
> longer possible.

Ya, I'll go with that (considering 7.1 was back in 2001 ... ) ... but, from the 
way Peter describes it (tagging partially checked-out code), I'm not 100% sure 
how it's possible to 'foul up' ... a tag operation is:

cvs -q update -APd .
cvs -q tag REL7_1 .

unless it's a sub-tagging, which would have:

cvs -q update -rREL7_1_STABLE -Pd .
cvs -q tag REL7_1_1 .

And since I don't do the update until things are "quiet" (generally when Tom 
has finished his last commit before release), I'm not sure how I could have 
gotten a 'partial checkout' ...

> +1 for adjusting the tags in CVS to match what we actually shipped.

Agreed ... but, stupid question here ... if our tags are wrong in CVS, are the 
7.1.x releases themselves wrong too?  When I do a release tarball, I run:

cvs -q export -rREL7_1_1 pgsql

so, if the tags are wrong, then all of those releases are wrong too, since they 
are based on the tag ...

-- 
Marc G. FournierHub.Org Hosting Solutions S.A. (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664




Re: [JDBC] Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread Guillaume Smet
On Wed, Apr 2, 2008 at 2:05 AM, Greg Smith <[EMAIL PROTECTED]> wrote:
>  I'm not sure if all of those changes are net positive for PostgreSQL
>  though, they weren't last time I played with this.

I fixed most of the bugs of JDBCBench I found when I benchmarked
Sequoia a long time ago. Totally forgot about it. I'll see if I can
find the sources somewhere.

-- 
Guillaume



Re: [HACKERS] build multiple indexes in single table pass?

2008-04-01 Thread Bruce Momjian
Andrew Dunstan wrote:
> 
>  From the "idle thoughts in the middle of the night" department:
> 
> I don't know if this has come up before exactly, but is it possible that 
> we could get a performance gain from building multiple indexes from a 
> single sequential pass over the base table? If so, that would probably 
> give us  a potential performance improvement in pg_restore quite apart 
> from the projected improvement to be got from running several steps in 
> parallel processes. The grammar might look a bit ugly, but I'm sure we 
> could finesse that.

TODO already has:

* Allow multiple indexes to be created concurrently, ideally via a
  single heap scan, and have pg_restore use it

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] New boxes available for QA

2008-04-01 Thread Stephen Frost
* Greg Smith ([EMAIL PROTECTED]) wrote:
> >=4 cores, >=8GB RAM, and >=8 disks with a usable write-caching controller
> in it.

hrmmm.  So a DL385G2, dual-proc/dual-core with 16GB of ram and 8 SAS
disks with a Smart Array P800 w/ 512MB of write cache would be helpful?  

I've got quite a few such machines, along with larger DL585s.  I can't
make one externally available immediately but I could set one up to do
benchmark runs and to dump the results to a public site.  What I don't
have atm is a lot of time, though, of course.  Are there scripts and
whatnot to get such a setup going quickly?

I'll also investigate actually making one available to the community.

Thanks,

Stephen




Re: [JDBC] Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread Greg Smith

On Tue, 1 Apr 2008, Guillaume Smet wrote:


A good answer is probably to plan optional JDBC benchmarks in the
benchfarm design - not all people want to run Java on their boxes but
we have servers of our own to do so.


The original pgbench was actually based on an older test named JDBCbench. 
That code is kind of old and buggy at this point.  But with some care and 
cleanup it's possible to benchmark not only relative Java performance with 
it, but you can compare it with pgbench running the same queries on the 
same tables to see how much overhead going through Java is adding.


Original code at http://mmmysql.sourceforge.net/performance/ , there's 
also some improved versions at 
http://developer.mimer.com/features/feature_16.htm


I'm not sure if all of those changes are net positive for PostgreSQL 
though, they weren't last time I played with this.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] New boxes available for QA

2008-04-01 Thread Greg Smith

On Tue, 1 Apr 2008, Guillaume Smet wrote:

I wonder if it's not worth having a very simple thing already 
reporting results, as the development cycle for 8.4 has already started 
(perhaps several pgbench unit tests testing various types of queries with 
a daily tree)


The pgbench-tools utilities I was working on at one point anticipated this 
sort of test starting one day.  You can't really get useful results out of 
pgbench without running it enough times that you get average or median 
values.  I dump everything into a results database which can be separated 
from the databases used for running the test, and then it's easy to 
compare day to day aggregate results across different query types.
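A minimal sketch of that aggregation step (made-up TPS numbers, purely illustrative):

```python
# Single pgbench runs are noisy, so aggregate repeated runs per
# configuration with a median. The numbers below are invented TPS
# results, not real benchmark data.
from statistics import mean, median

runs = {                                  # config -> TPS of each run
    "shared_buffers=32MB":  [412.0, 455.3, 398.7, 441.2, 872.9],
    "shared_buffers=256MB": [505.1, 512.8, 498.4, 509.0, 501.7],
}

for config, tps in sorted(runs.items()):
    # The outlier (872.9) drags the mean up but barely moves the median,
    # which is why median-over-runs gives steadier day-to-day comparisons.
    print(f"{config}: mean={mean(tps):.1f} median={median(tps):.1f}")
```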


I haven't had a reason to work on that recently, but if you've got a 
semi-public box ready for benchmarks now I do.  Won't be able to run any 
serious benchmarks on the systems you described, but should be great for 
detecting basic regressions and testing less popular compile-time options 
as you describe.


As far as the other more powerful machines you mentioned go, would need to 
know a bit more about the disks and disk controller in there to comment 
about whether those are worth the trouble to integrate.  The big missing 
piece of community hardware that remains elusive would be a system with
>=4 cores, >=8GB RAM, and >=8 disks with a usable write-caching controller
in it.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Submission of Feature Request : RFC- for Implementing Transparent Data Encryption in P

2008-04-01 Thread Bruce Momjian
sanjay sharma wrote:
> 
> Hello Heikki,
> 
> Although the solution could be implemented using views and
> functions, and I am implementing a reference application using
> that approach, TDE can greatly reduce the design and maintenance
> complexity. It would also take care of data protection in
> backups and archives.  You are correct to note that TDE may
> not provide the complete data security required for data like credit
> card details, but TDE seems ideally suited to taking care of
> data privacy issues. A major chunk of private data is of no
> interest to hackers and criminals but needs protection only from
> casual observers. Implementing a full data security infrastructure
> to address only privacy issues seems like overkill. Compliance
> requirements for storing private data arise from each organization's
> own declared privacy policies and from statutory bodies like privacy
> commissioners and other privacy watchdogs. These standards are
> not as strict as PCI, HIPAA, or Sarbanes-Oxley.
> 
> Compliance with HIPAA regulations requires not only maintaining
> records of who created and updated a record but also of who
> accessed and viewed records, when, and in what context.

Agreed, the bottom line is that the tools needed to do what you want are
there, but they are probably more complex to implement than in Oracle. 
We probably offer fewer canned solutions than Oracle, but more
flexibility.

--
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



[HACKERS] column level privileges

2008-04-01 Thread Andrew Dunstan


Apologies if this gets duplicated - original seems to have been dropped 
due to patch size - this time I am sending it gzipped.


cheers

andrew

 Original Message 
Subject:column level privileges
Date:   Tue, 01 Apr 2008 08:32:25 -0400
From:   Andrew Dunstan <[EMAIL PROTECTED]>
To: Patches (PostgreSQL) <[EMAIL PROTECTED]>



This patch by Golden Lui was his work for the last Google SoC. I was his 
mentor for the project. I have just realised that he didn't send his 
final patch to the list.


I guess it's too late for the current commit-fest, but it really needs 
to go on a patch queue (my memory on this was jogged by Tom's recent 
mention of $Subject).


I'm going to see how much bitrot there is and see what changes are 
necessary to get it to apply.


cheers

andrew


-
Here is a README for the whole patch.

According to the SQL92 standard, there are four levels in the privilege 
hierarchy, i.e. database, tablespace, table, and column. Most commercial 
DBMSs support all of these levels, but column-level privileges have 
hitherto been unaddressed in PostgreSQL; this patch tries to implement them.


What this patch does:
1. Execution of GRANT/REVOKE for column privileges. Only the 
INSERT/UPDATE/REFERENCES privileges are supported for now, as SQL92 
specifies; the SELECT privilege is not yet supported. This part includes:
   1.1 Add a column named 'attrel' to the pg_attribute catalog to store 
column privileges. All column privileges are now stored, regardless of 
whether they could be implied from table-level privileges.

   1.2 A parser for the new kind of GRANT/REVOKE commands.
   1.3 Execution of GRANT/REVOKE for column privileges. The corresponding 
column privileges are added/removed automatically if no column is 
specified, as the SQL standard requires.

2. Column-level privilege check.
   For the UPDATE/INSERT/REFERENCES privileges, the privilege check is 
now done ONLY at the column level. The table-level privilege check used 
to be done in the function InitPlan; with this patch, these three kinds 
of privileges are checked during the parse phase.
   2.1 For UPDATE/INSERT commands, the privilege check is done in the 
function transformUpdateStmt/transformInsertStmt.
   2.2 For REFERENCES, the privilege check is done in the function 
ATAddForeignKeyConstraint. This function is called whenever a 
foreign key constraint is added, e.g. by CREATE TABLE, ALTER TABLE, etc.
   2.3 For the COPY command, the INSERT privilege is checked in the 
function DoCopy; the SELECT case is checked in DoCopy too.
3. While adding a new column to a table using the ALTER TABLE command, 
set appropriate privileges for the new column according to the 
privileges already granted on the table.

4. Allow pg_dump and pg_dumpall to dump and restore column privileges.
5. Add a column named objsubid to the pg_shdepend catalog to record ACL 
dependencies between columns and roles.

6. Modify the grammar of ECPG to support column-level privileges.
7. Change psql's \z (\dp) command to support listing column privileges 
for tables and views. If \z (\dp) is run with a pattern, column 
privileges are listed after table-level privileges.
8. Regression tests for column-level privileges. I changed both 
privileges.sql and expected/privileges.out, so the regression tests now 
all pass.


Best wishes
Dong
--
Guodong Liu
Database Lab, School of EECS, Peking University
Room 314, Building 42, Peking University, Beijing, 100871, China




pg_colpriv_version_0.4.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Access to Row ID information in Functions

2008-04-01 Thread Paul Ramsey
Thanks Tom,

Yes, we've discussed adding some kind of optional identity information
to the object, it remains a potential course of action.

Paul

On Tue, Apr 1, 2008 at 2:37 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Paul Ramsey" <[EMAIL PROTECTED]> writes:
>  > The "optimized" form gets cached and retrieved from a memory context.
>  > Each time the function is run within a statement it checks the cache,
>  > and sees if one of its arguments are the same as the last time around.
>  > If so, it uses the prepared version of that argument. If not, it
>  > builds a new prepared version and caches that.
>
>  > The key here is being able to check the identity of the arguments...
>  > is this argument A the same as the one we processed last time? One way
>  > is to do a memcmp.  But it seems likely that PgSQL knows exactly
>  > whether it is running a nested loop, or a literal, and could tell
>  > somehow that argument A is the same with each call.
>
>  Not really.  Certainly there's no way that that information would
>  propagate into function calls.
>
>  In the special case where your argument is a literal constant, I think
>  there is enough information available to detect that that's the case
>  (look at get_fn_expr_argtype).  But if it's not, there's no very good
>  way to know whether it's the same as last time.
>
>  Perhaps it would be worth changing your on-disk storage format to allow
>  cheaper checking?  For instance include a hash value.
>
> regards, tom lane
>



Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-04-01 Thread Murray Cumming
On Tue, 2008-04-01 at 15:34 +, [EMAIL PROTECTED] wrote:
> I would really prefer a more loosely coupled system.

The functionality will be much the same. The implementation would be
more difficult and obscure and there would be more points of failure and
more things to configure, but it wouldn't remove much risk, I think.

Anyway, this feature is already in Postgres when it's built for Mac OS X.
So this decision seems to have been made already, at least for that
platform.

-- 
[EMAIL PROTECTED]
www.murrayc.com
www.openismus.com




Re: [HACKERS] Access to Row ID information in Functions

2008-04-01 Thread Tom Lane
"Paul Ramsey" <[EMAIL PROTECTED]> writes:
> The "optimized" form gets cached and retrieved from a memory context.
> Each time the function is run within a statement it checks the cache,
> and sees if one of its arguments are the same as the last time around.
> If so, it uses the prepared version of that argument. If not, it
> builds a new prepared version and caches that.

> The key here is being able to check the identity of the arguments...
> is this argument A the same as the one we processed last time? One way
> is to do a memcmp.  But it seems likely that PgSQL knows exactly
> whether it is running a nested loop, or a literal, and could tell
> somehow that argument A is the same with each call.

Not really.  Certainly there's no way that that information would
propagate into function calls.

In the special case where your argument is a literal constant, I think
there is enough information available to detect that that's the case
(look at get_fn_expr_argtype).  But if it's not, there's no very good
way to know whether it's the same as last time.

Perhaps it would be worth changing your on-disk storage format to allow
cheaper checking?  For instance include a hash value.

regards, tom lane



Re: [HACKERS] Scroll cursor oddity...

2008-04-01 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> What's implied by that but perhaps not clear is that it's easier to think of
> cursors as being *between* rows rather than *on* rows. I'm not sure the
> standard entirely adopts that model however. 

That's an interesting way of thinking about it, but I think it fails
when you consider UPDATE/DELETE WHERE CURRENT OF.

regards, tom lane



[HACKERS] Lots and lots of strdup's (bug #4079)

2008-04-01 Thread Tom Lane
I looked into the complaint here
http://archives.postgresql.org/pgsql-bugs/2008-04/msg5.php
about 8.3 being a lot slower than 8.2.  Apparently what he's
doing is sending a whole lot of INSERT commands in a single
query string.  And, sure enough, 8.3 is a lot slower.  The
oprofile output is, um, localized:

samples  %image name   symbol name
749240   48.0999  libc-2.7.so  memcpy
392513   25.1986  libc-2.7.so  strlen
331905   21.3077  libc-2.7.so  memset
5302  0.3404  postgres AllocSetCheck
3548  0.2278  postgres AllocSetAlloc
3507  0.2251  postgres base_yyparse
3160  0.2029  postgres hash_search_with_hash_value
2068  0.1328  postgres SearchCatCache
1737  0.1115  postgres base_yylex
1606  0.1031  postgres XLogInsert

I eventually traced this down to the strdup's that were inserted into
PortalDefineQuery() in this patch:
http://archives.postgresql.org/pgsql-committers/2007-03/msg00098.php
that is, the cost differential is entirely because we started copying
a Portal's source query text into the Portal's own memory.  In the
example at hand here, the source query string is big (about a quarter
megabyte for 5 INSERTs), and we do that copy 5 times.
Even though strdup is cheap on a per-byte basis, the O(N^2) law
eventually catches up.

Although you could argue that this example represents crummy SQL
coding style, it's still a performance regression from pre-8.3,
so I think we need to fix it.

It seems to me to be clearly necessary to copy the source text into
the Portal if the Portal is going to be long-lived ... but in the
case of simple-Query execution we know darn well that the original
string in MessageContext is going to outlive the Portal, so the
copying isn't really needed --- and this seems like the only code
path where the problem exists.  In other cases a single querystring
isn't likely (or, usually, even able) to contain more than one command
so no repeat copying will occur.

So I'm inclined to revert the decision I made in that patch that
PortalDefineQuery() should copy the strings rather than expecting
the caller to be responsible for providing a suitably long-lived
string.  We could handle this in two ways:
* Put the strdup operations back into the callers that need them,
ie just revert the logic change.
* Add an additional bool parameter to PortalDefineQuery to tell it
whether the strings need to be copied.

The second option seems a bit cleaner to me but might conceivably break
third party code, if there is any that calls PortalDefineQuery.
OTOH, if anyone out there has started to depend on the assumption
that PortalDefineQuery will copy their strings, the first option
would break their code silently, which is even worse than breaking
it obviously.

Thoughts?

regards, tom lane



Re: [HACKERS] Scroll cursor oddity...

2008-04-01 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Mike Aubury <[EMAIL PROTECTED]> writes:
>> ie - under postgresql it appears we've scrolled *past* the last row and need 
>> an additional fetch to get back to our last row..
>
> Why do you find that surprising?  It seems to me to be symmetrical with
> the case at the beginning of the table --- the cursor is initially
> positioned before the first row.  Why shouldn't there be a corresponding
> state where it's positioned after the last row?

What's implied by that but perhaps not clear is that it's easier to think of
cursors as being *between* rows rather than *on* rows. I'm not sure the
standard entirely adopts that model however. 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-01 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Simon Riggs <[EMAIL PROTECTED]> writes:
>> On Tue, 2008-04-01 at 13:07 +0530, Pavan Deolasee wrote:
>>> Please see the attached patch. One change I made is to hold the SHARE lock
>>> on the page while ANALYZE is reading tuples from it. I thought it would
>>> be a right thing to do instead of repeatedly acquiring/releasing the lock.
>
>> ANALYZE is a secondary task and so we shouldn't be speeding it up at the
>> possible expense of other primary tasks. So I think holding locks for
>> longer than minimum is not good in this case and I see no reason to make
>> the change described.
>
> I think Pavan's change is probably good.  In the first place, it's only
> a shared buffer lock and besides ANALYZE isn't going to be holding it
> long (all it's doing at this point is counting tuples and copying some
> of them into memory).  In the second place, repeated lock release and
> re-grab threatens cache line contention and a context swap storm if
> there is anyone else trying to access the page.  In the third, whether
> there's contention or not the extra acquire/release work will cost CPU
> cycles.  In the fourth, if we actually believed this was a problem we'd
> need to redesign VACUUM too, as it does the same thing.

I'm not sure all those arguments are valid (at least the first two seem
contradictory). However, I'm skeptical about Simon's premise. It's not clear
that any changes to ANALYZE here are at the expense of other processes. Any
cycles saved in ANALYZE are available for those other processes, after all...



-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



[HACKERS] Access to Row ID information in Functions

2008-04-01 Thread Paul Ramsey
In PostGIS, we have a problem, in that spatial operations are very
costly, CPU-wise.

We have hit on a nifty enhancement recently, which was to recognize
that when processing multiple rows, in joins or with literal
arguments, for most functions of the form GeometryOperation(A, B), A
(or B) tended to remain constant, while the other argument changed.
That meant that we could build an "optimized" form of the
more-constant argument (using internal index structures on the object
segments) that allows testing the changing argument much more quickly.

The "optimized" form gets cached and retrieved from a memory context.
Each time the function is run within a statement it checks the cache,
and sees if one of its arguments are the same as the last time around.
If so, it uses the prepared version of that argument. If not, it
builds a new prepared version and caches that.

The key here is being able to check the identity of the arguments...
is this argument A the same as the one we processed last time? One way
is to do a memcmp.  But it seems likely that PgSQL knows exactly
whether it is running a nested loop, or a literal, and could tell
somehow that argument A is the same with each call.

For lack of a better term, if we knew what the "row id" of each
argument was as the function was called, we could skip the memcmp
testing of geometric identity (which gets more expensive precisely at
the time our optimization gets more effective, for large arguments)
and just trust the row id as the guide of when to build and cache new
"optimized" representations.

Any guidance?

Thanks,

Paul



Re: [HACKERS] Several tags around PostgreSQL 7.1 broken

2008-04-01 Thread Aidan Van Dyk
* Tom Lane <[EMAIL PROTECTED]> [080401 14:15]:
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > In the meantime, does anyone have more information about how this came 
> > about?
> 
> Marc's always done both the tagging and the tarball-making, so you'd
> have to ask him about that.  I believe he's made it more scripted over
> the years, so this might reflect a manual foulup that (hopefully) is no
> longer possible.
> 
> +1 for adjusting the tags in CVS to match what we actually shipped.
> 
>   regards, tom lane

If somebody's going to be fudging around in $CVSROOT, is it possible to
give us all a big warning, and hopefully "pause" the anoncvs/rsync sync
stuff for the duration of the "history change", so we (those of us out
here working off anoncvs/rsync) get an atomic swap of ancient history?

a.
-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] Several tags around PostgreSQL 7.1 broken

2008-04-01 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> In the meantime, does anyone have more information about how this came about?

Marc's always done both the tagging and the tarball-making, so you'd
have to ask him about that.  I believe he's made it more scripted over
the years, so this might reflect a manual foulup that (hopefully) is no
longer possible.

+1 for adjusting the tags in CVS to match what we actually shipped.

regards, tom lane



Re: [HACKERS] Several tags around PostgreSQL 7.1 broken

2008-04-01 Thread Aidan Van Dyk


* Peter Eisentraut <[EMAIL PROTECTED]> [080401 12:01]:
> I have now managed to investigate why some conversions of the PostgreSQL CVS 
> repository to other formats are having trouble or are failing.  Here, I am 
> looking at git-cvsimport in particular.
> 
> The problem appears to be that several tags around the time of PostgreSQL 7.1 
> are broken or inconsistent.  For instance, here is a piece of output of 
> cvsps:
> 
> WARNING: Invalid PatchSet 9441, Tag REL7_1:
> src/pl/plpgsql/src/mklang.sql.in:1.6=after, 
> src/pl/plpgsql/src/INSTALL:1.2=before. Treated as 'before'

I talked about this here:
http://mid.gmane.org/[EMAIL PROTECTED]

Here's a "quick hack" to cvsps that I use to ignore the problematic
tags.  Of course, I've stuck with fromcvs for my PostgreSQL git mirror
simply because even though I *can* use cvsps on PostgreSQL with that
hack, it's still loads slower than fromcvs.  Since cvsps/cvsimport is so
slow, I didn't pursue making this patch usable or moving forward with
using cvsps/cvsimport.

[EMAIL PROTECTED]:~/build/cvsps.git$ git diff
diff --git a/cvsps.c b/cvsps.c
index 981cd78..d436591 100644
--- a/cvsps.c
+++ b/cvsps.c
@@ -2184,15 +2184,28 @@ static void parse_sym(CvsFile * file, char * sym)
     }
 }
 
+const char* skip_symbols[] =
+{
+    "REL7_1_BETA",
+    "REL7_1_BETA2",
+    "REL7_1_BETA3",
+    NULL
+};
+
 void cvs_file_add_symbol(CvsFile * file, const char * rev_str, const char * p_tag_str)
 {
     CvsFileRevision * rev;
     GlobalSymbol * sym;
     Tag * tag;
+    int i;
 
     /* get a permanent storage string */
     char * tag_str = get_string(p_tag_str);
 
+    for (i = 0; skip_symbols[i] != NULL; i++)
+        if (strcmp(tag_str, skip_symbols[i]) == 0)
+            return;
+
     debug(DEBUG_STATUS, "adding symbol to file: %s %s->%s", file->filename, tag_str, rev_str);
     rev = cvs_file_add_revision(file, rev_str);
     put_hash_object_ex(file->symbols, tag_str, rev, HT_NO_KEYCOPY, NULL, NULL);

a.

-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] Cast as compound type

2008-04-01 Thread David Fetter
On Mon, Mar 31, 2008 at 07:18:43PM -0400, Korry Douglas wrote:
> David Fetter wrote:
>> I'd like to take a whack at making set-returning functions
>> returning SETOF RECORD a little more fun to use.  Let's imagine
>> that we have a table foo and a function returning SETOF RECORD that
>> can return foos.  The call might look something like:
>>
>> SELECT a, b, c
>> FROM f(ROW OF foo)
>> WHERE ...;
>>
>> This would make it much easier and less error-prone to use SETOF
>> RECORD.
>>   
> David, it sounds like you really want to declare the return type of
> the function?  In your above example, you want to say that, in this
> particular invocation, function f() returns a SETOF foo's.  Is that
> correct?

Yes.

> If you were to create a function that returns a RECORD (not a SETOF RECORD), 
> you would call it like this:
>
>SELECT * FROM f() AS (column1 type1, column2 type2, column3 type3);
>
> In your case, I think you want to declare the return type using an 
> explicitly defined composite type (possibly a table row); which would imply 
> syntax such as:
>
>SELECT * FROM f() AS (foo);
>   or
>SELECT * FROM f() AS (foo.*);
>
> So, it seems like you want the syntax to look more like:
>
>SELECT a,b,c, FROM f() AS (SETOF foo);
>
> Does that make sense to you?  Your original syntax implied that the
> "ROW OF foo" was somehow related to the function arguments.
>-- Korry

I see.

Thinking a little further, it seems we could do this a little more
generally.  Here's what it could look like.

AS ( <output alias> {, <output alias> } )
<output alias> :=
    <column definition>
  | [ <alias prefix> ] <compound type> ;

<alias prefix> would be prepended to each column in the
output, so for a compound type foo(i int, t text, p point), AS (f foo)
would produce output columns f.i, f.t and f.p.  Typical uses for this
would be to keep a set of column names distinct.

Cheers,
David.
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



[HACKERS] Several tags around PostgreSQL 7.1 broken

2008-04-01 Thread Peter Eisentraut
I have now managed to investigate why some conversions of the PostgreSQL CVS 
repository to other formats are having trouble or are failing.  Here, I am 
looking at git-cvsimport in particular.

The problem appears to be that several tags around the time of PostgreSQL 7.1 
are broken or inconsistent.  For instance, here is a piece of output of 
cvsps:

WARNING: Invalid PatchSet 9441, Tag REL7_1:
src/pl/plpgsql/src/mklang.sql.in:1.6=after, 
src/pl/plpgsql/src/INSTALL:1.2=before. Treated as 'before'

It turns out that src/pl/plpgsql/src/mklang.sql.in:1.6 and 
src/pl/plpgsql/src/INSTALL:1.2 are from the same commit ("PatchSet"), as 
determined by a common log message and timestamp, but the REL7_1 tag is on 
src/pl/plpgsql/src/mklang.sql.in:1.5 and src/pl/plpgsql/src/INSTALL:1.2.  So 
a part of the commit is before the tag and part of it is after the tag.

(The commit in question was to remove mklang.sql.in and adjust the INSTALL 
contents accordingly.)

In fact, if you check out the REL7_1 tag, you get the new INSTALL file but 
still the mklang.sql.in.  The released postgresql-7.1.tar.gz tarball is 
correct.

I guess the cause of this is that the tag was done on a partially updated 
checkout.

There are a few dozen inconsistencies like this in the tags REL7_1_BETA, 
REL7_1_BETA2, REL7_1_BETA3, REL7_1.  As a consequence of this, checkouts of 
the tags don't match the respective released tarballs.

Here are more examples:

WARNING: Invalid PatchSet 9297, Tag REL7_1:
src/backend/port/hpux/port-protos.h:1.10=after, COPYRIGHT:1.6=before. 
Treated as 'before'
WARNING: Invalid PatchSet 8906, Tag REL7_1:
doc/src/sgml/populate.sgml:2.4=after, 
doc/src/sgml/filelist.sgml:1.3=before. Treated as 'before'
WARNING: Invalid PatchSet 9371, Tag REL7_1:
doc/TODO.detail/subquery:1.3=after, doc/TODO:1.366=before. Treated 
as 'before'
WARNING: Invalid PatchSet 8815, Tag REL7_1_BETA2:
src/include/c.h:1.2=after, contrib/pgcrypto/md5.c:1.2=before. Treated 
as 'before'
etc.

I could get the conversion to run successfully if I remove the mentioned tags.  
When I find more time, I'm going to see whether I can move/fix the tags instead so 
they correspond to the actual releases and the patch sets become consistent.

In the meantime, does anyone have more information about how this came about?



Re: [HACKERS] Debugging PostgreSQL with GDB

2008-04-01 Thread Andrew Dunstan



Manolo SupaMA wrote:

Hi.

I am having problems debugging PostgreSQL on Linux using GDB. I made 
some changes to src/backend/utils/sort/tuplesort.c, but it hangs while 
performing run formation (where my changes are).


I configured it using

./configure --prefix="/usr/local/pgsql/8.3/" CFLAGS="-O0 -DTRACE_SORT" 
--enable-debug --enable-cassert --enable-depend


and trying to debug it using

'gdb postmaster'
(which revealed to be not a very good idea) and
'gdb pg_ctl' followed by 'run -D /usr/local/psql/data'

This last choice allowed me to set a breakpoint on puttuple_common 
(contained in tuplesort.c), but then I'm unable to run 'psql'. 
http://pastebin.com/m6a97b4dd


I'm new to GDB and this is also my first PostgreSQL patch. I just want 
some suggestions to know whether I'm on the right track.


Thanks for your time.

PS: I suppose I'll write some related "HowTo Debug PostgreSQL with 
GDB - Basics" for newbies like me.



The way to do this is to start the postmaster normally, run psql, and 
then attach the debugger to the backend that is talking to your psql 
session.


I would also personally advise using a gdb frontend like DDD. I am also 
told Eclipse can work well.


cheers

andrew



Re: [HACKERS] Debugging PostgreSQL with GDB

2008-04-01 Thread korry


I am having problems debugging PostgreSQL on Linux using GDB. I made 
some changes to src/backend/utils/sort/tuplesort.c, but it hangs while 
performing run formation (where my changes are).


I configured it using

./configure --prefix="/usr/local/pgsql/8.3/" CFLAGS="-O0 -DTRACE_SORT" 
--enable-debug --enable-cassert --enable-depend


and trying to debug it using

'gdb postmaster'
(which revealed to be not a very good idea) and
'gdb pg_ctl' followed by 'run -D /usr/local/psql/data'

This last choice allowed me to set a breakpoint on puttuple_common 
(contained in tuplesort.c), but then I'm unable to run 'psql'. 
http://pastebin.com/m6a97b4dd
Run psql, find the process ID of the backend (server) process by 
executing the command "SELECT * FROM pg_backend_pid();",
then attach to that process with gdb and set a breakpoint in your new 
code; finally, go back to your psql session and execute a command that 
exercises your code.


  -- Korry

--

 Korry Douglas  <[EMAIL PROTECTED]>
 EnterpriseDBhttp://www.enterprisedb.com




Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-04-01 Thread tomas

On Tue, Apr 01, 2008 at 05:07:31PM +0200, Mathias Hasselmann wrote:
[...]
> > Personally, I'd be rather scared than delighted ;-)
> 
> So in data centers you don't even trust the machines in your broadcast
> domain?

Kind of. Put it another way: never run services you don't use.

[...]

> > Isn't there a less-intrusive option to linking a lib into each and every
> > possible server, like a config file in which to put what is to be announced?
> 
> You could directly talk to the D-Bus interface of Avahi. libavahi-client
> is just a convenience wrapper, but this route will be much more
> cumbersome.

So this goes through the D-Bus. Makes kind of sense. Thanks for the
enlightenment.

> One other route is calling avahi-publish-service on startup and killing
> it on shutdown, but: avahi-publish-service really only exists for
> demonstration purposes and doesn't handle service name collisions for
> instance. I don't believe that a high-profile application like
> PostgreSQL should rely on low-quality hacks, like invoking educational
> demo programs.

Inelegant as it might seem, this solution still affords a lot more
when it comes to "separation of concerns". I'm still a bit wary of the
prospect that each and every daemon evolves into a huge fuzzball
linked to all conceivable service-lets, with a multitude of funny
side-effects (remember tcpwrappers?).

Of course, "you can always disable this at compile time", but let's face
it: with the predominance of binary distributions, the path of least
resistance will be to put up with whatever strange side-effects arise.

I would really prefer a more loosely coupled system.

Regards
-- tomás



[HACKERS] Debugging PostgreSQL with GDB

2008-04-01 Thread Manolo SupaMA
Hi.

I am having problems debugging PostgreSQL on Linux using GDB. I made some
changes to src/backend/utils/sort/tuplesort.c, but it hangs while performing
run formation (where my changes are).

I configured it using

./configure --prefix="/usr/local/pgsql/8.3/" CFLAGS="-O0 -DTRACE_SORT"
--enable-debug --enable-cassert --enable-depend

and trying to debug it using

'gdb postmaster'
(which revealed to be not a very good idea) and
'gdb pg_ctl' followed by 'run -D /usr/local/psql/data'

This last choice allowed me to set a breakpoint on puttuple_common
(contained in tuplesort.c), but then I'm unable to run 'psql'.
http://pastebin.com/m6a97b4dd

I'm new to GDB and this is also my first PostgreSQL patch. I just want some
suggestions to know whether I'm on the right track.

Thanks for your time.

PS: I suppose I'll write some related "HowTo Debug PostgreSQL with GDB -
Basics" for newbies like me.


Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-04-01 Thread Mathias Hasselmann

On Tuesday, 2008-04-01 at 12:02 +, [EMAIL PROTECTED] wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On Tue, Apr 01, 2008 at 09:35:56AM +0200, Mathias Hasselmann wrote:
> > On Saturday, 2008-03-29 at 12:25 +, [EMAIL PROTECTED] wrote:
> > [...]
> > > Sorry for a dumb question, but I couldn't figure that out from your
> > > references [1]..[4]: does that mean that the PostgreSQL server would
> > > "advertise itself" on the local net? Or what is the purpose of liking-in
> > > libavahi into the postmaster?
> > 
> > Yes, that's the purpose.
> > 
> > > Surely one wouldn't want this in a data center? 
> > 
> > Yes, this feature definitely targets small-office use, personal use, DB
> > developers [...]
> > Still you can tell Avahi to explicitly announce at a certain, non-local
> > domain, but this feature is not implemented by the patch. Maybe database
> > developers in large network environments could make use of such
> > announcements. It would be trivial to add.
> 
> Personally, I'd be rather scared than delighted ;-)

So in data centers you don't even trust the machines in your broadcast
domain?

> > > Is there a possiblity to disable that at run time?
> > 
> > The feature is disabled by default. As long as you do not specify a
> > zeroconf_name in your configuration file, nothing happens. This is the
> > same behavior as established by the Bonjour code.
> 
> Thanks, good to know.
> 
> Isn't there a less-intrusive option to linking a lib into each and every
> possible server, like a config file in which to put what is to be announced?

You could directly talk to the D-Bus interface of Avahi. libavahi-client
is just a convenience wrapper, but this route will be much more
cumbersome.

One other route is calling avahi-publish-service on startup and killing
it on shutdown, but avahi-publish-service really only exists for
demonstration purposes and doesn't handle service name collisions, for
instance. I don't believe that a high-profile application like
PostgreSQL should rely on low-quality hacks such as invoking educational
demo programs.

Ciao,
Mathias
-- 
Mathias Hasselmann <[EMAIL PROTECTED]>
http://www.openismus.com/ - We can get it done.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-01 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Tue, 2008-04-01 at 10:22 -0400, Tom Lane wrote:
>> In the fourth, if we actually believed this was a problem we'd
>> need to redesign VACUUM too, as it does the same thing.

> VACUUM waits until nobody else has the buffer pinned, so lock contention
> is much less of a consideration there. Plus it rearranges the block,
> which is hard to do one tuple at a time even if we wanted to.

That's the second scan.  The first scan acts exactly like Pavan is
proposing for ANALYZE.

regards, tom lane



Re: [HACKERS] Scroll cursor oddity...

2008-04-01 Thread Tom Lane
Mike Aubury <[EMAIL PROTECTED]> writes:
> i.e. under PostgreSQL it appears we've scrolled *past* the last row and need 
> an additional fetch to get back to our last row...

Why do you find that surprising?  It seems to me to be symmetrical with
the case at the beginning of the table --- the cursor is initially
positioned before the first row.  Why shouldn't there be a corresponding
state where it's positioned after the last row?
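Tom's model, with one extra position at each end, can be sketched as a toy cursor (a hypothetical illustration, not PostgreSQL code):

```python
# Toy model (not PostgreSQL source): a scroll cursor over n rows has
# n + 2 positions: 0 = before the first row, n + 1 = after the last.
class ScrollCursor:
    def __init__(self, rows):
        self.rows = rows
        self.pos = 0  # initially positioned before the first row

    def _current(self):
        # Positions 1..n sit on a row; 0 and n + 1 return no row.
        if 1 <= self.pos <= len(self.rows):
            return self.rows[self.pos - 1]
        return None

    def fetch_next(self):
        if self.pos <= len(self.rows):
            self.pos += 1
        return self._current()

    def fetch_prior(self):
        if self.pos >= 1:
            self.pos -= 1
        return self._current()
```

With rows 1, 2, 3, four FETCH NEXTs return 1, 2, 3 and then no row (the cursor is now after the last row), and the first FETCH PRIOR returns 3 again, matching the behaviour Mike observed.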

regards, tom lane



Re: [HACKERS] build multiple indexes in single table pass?

2008-04-01 Thread Tom Lane
Aidan Van Dyk <[EMAIL PROTECTED]> writes:
> * Andrew Dunstan <[EMAIL PROTECTED]> [080401 08:22]:
>> I don't know if this has come up before exactly, but is it possible that 
>> we could get a performance gain from building multiple indexes from a 
>> single sequential pass over the base table?

> I've not looked at any of the code, but would the "synchronized scans"
> heap machinery help the multiple index creations walk the heap together,
> basically giving you this for free (as long as you start concurrent
> index creation)?

Yeah, that should Just Work AFAICS.  Note also that this approach would
let you put multiple CPUs to work on the problem, whereas anything
involving stuffing multiple index creations into a single command
won't.

regards, tom lane



Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-01 Thread Simon Riggs
On Tue, 2008-04-01 at 10:22 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > On Tue, 2008-04-01 at 13:07 +0530, Pavan Deolasee wrote:
> >> Please see the attached patch. One change I made is to hold the SHARE lock
> >> on the page while ANALYZE is reading tuples from it. I thought it would
> >> be a right thing to do instead of repeatedly acquiring/releasing the lock.
> 
> > ANALYZE is a secondary task and so we shouldn't be speeding it up at the
> > possible expense of other primary tasks. So I think holding locks for
> > longer than minimum is not good in this case and I see no reason to make
> > the change described.
> 
> I think Pavan's change is probably good.  In the first place, it's only
> a shared buffer lock and besides ANALYZE isn't going to be holding it
> long (all it's doing at this point is counting tuples and copying some
> of them into memory).  In the second place, repeated lock release and
> re-grab threatens cache line contention and a context swap storm if
> there is anyone else trying to access the page. In the third, whether
> there's contention or not the extra acquire/release work will cost CPU
> cycles.  In the fourth, if we actually believed this was a problem we'd
> need to redesign VACUUM too, as it does the same thing.

VACUUM waits until nobody else has the buffer pinned, so lock contention
is much less of a consideration there. Plus it rearranges the block,
which is hard to do one tuple at a time even if we wanted to.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk




Re: [JDBC] [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread PFC

On Tue, 01 Apr 2008 16:06:01 +0200, Tom Lane <[EMAIL PROTECTED]> wrote:


Dave Cramer <[EMAIL PROTECTED]> writes:

Was the driver ever changed to take advantage of the above strategy?


Well, it's automatic as long as you use the unnamed statement.  About
all that might need to be done on the client side is to use unnamed
statements more often in preference to named ones, and I believe that
something like that did get done in JDBC.

regards, tom lane



PHP is also affected if you use pg_query_params...
Syntax: pg_query_params( "SQL with $1, $2, ... params", array( parameters ) )

Note that value is TEXT, indexed, there are 100K rows in table.

pg_query( "SELECT * FROM test WHERE id =12345" ); 1 rows in  
0.15931844711304 ms
pg_query( "SELECT * FROM test WHERE value LIKE '1234%'" ); 11 rows in  
0.26795864105225 ms


pg_query_params( "SELECT * FROM test WHERE id =$1", array( 12345 ) ); 1  
rows in 0.16618013381958 ms
pg_query_params( "SELECT * FROM test WHERE value LIKE $1", array( '1234%'  
)); 11 rows in 40.66633939743 ms


Last query does not use index.
However, since no one uses pg_query_params in PHP (PHP coders just  
LOVE to escape their strings manually, or worse, use magic quotes), no one  
should notice ;)
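The slow case above is the generic plan: when the planner sees only LIKE $1 it cannot rewrite the pattern into an index range, whereas a literal 'prefix%' known at plan time can become a range scan on a sorted index. A toy sketch of that rewrite (hypothetical, not the PostgreSQL planner; only the % wildcard is handled):

```python
# Toy illustration (not the PostgreSQL planner): a simple 'prefix%'
# pattern known at plan time is turned into a range scan over a sorted
# "index"; any other pattern falls back to a full ("seq") scan.
import bisect
from fnmatch import fnmatchcase

def scan(sorted_values, pattern):
    if pattern.endswith("%") and "%" not in pattern[:-1]:
        prefix = pattern[:-1]
        lo = bisect.bisect_left(sorted_values, prefix)
        hi = bisect.bisect_right(sorted_values, prefix + "\uffff")
        return "index", sorted_values[lo:hi]   # index range scan
    glob = pattern.replace("%", "*")           # generic case: full scan
    return "seq", [v for v in sorted_values if fnmatchcase(v, glob)]
```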




Re: [JDBC] [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread Dave Cramer

So if I write

conn.prepareStatement("select col from table where col like ?")

then setString(1,'hello%')

The driver will do

prepare foo as select col from table where col like $1

and then

execute foo('hello%')

this will take advantage of the strategy automatically?

If so, this should be changed. The driver does this all the time.

Dave

On 1-Apr-08, at 10:06 AM, Tom Lane wrote:


Dave Cramer <[EMAIL PROTECTED]> writes:

Was the driver ever changed to take advantage of the above strategy?


Well, it's automatic as long as you use the unnamed statement.  About
all that might need to be done on the client side is to use unnamed
statements more often in preference to named ones, and I believe that
something like that did get done in JDBC.

regards, tom lane






Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-01 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Tue, 2008-04-01 at 13:07 +0530, Pavan Deolasee wrote:
>> Please see the attached patch. One change I made is to hold the SHARE lock
>> on the page while ANALYZE is reading tuples from it. I thought it would
>> be a right thing to do instead of repeatedly acquiring/releasing the lock.

> ANALYZE is a secondary task and so we shouldn't be speeding it up at the
> possible expense of other primary tasks. So I think holding locks for
> longer than minimum is not good in this case and I see no reason to make
> the change described.

I think Pavan's change is probably good.  In the first place, it's only
a shared buffer lock and besides ANALYZE isn't going to be holding it
long (all it's doing at this point is counting tuples and copying some
of them into memory).  In the second place, repeated lock release and
re-grab threatens cache line contention and a context swap storm if
there is anyone else trying to access the page.  In the third, whether
there's contention or not the extra acquire/release work will cost CPU
cycles.  In the fourth, if we actually believed this was a problem we'd
need to redesign VACUUM too, as it does the same thing.

I haven't read the patch yet, but I'm inclined to go with the design
Pavan suggests.

regards, tom lane



Re: [JDBC] [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread Michael Paesold

On 01.04.2008 at 13:14, Dave Cramer wrote:


On 1-Apr-08, at 6:25 AM, Michael Paesold wrote:



On 01.04.2008 at 01:26, Tom Lane wrote:
While testing the changes I was making to Pavel's EXECUTE USING patch

to ensure that parameter values were being provided to the planner,
it became painfully obvious that the planner wasn't actually *doing*
anything with them.  For example

execute 'select count(*) from foo where x like $1' into c using $1;

wouldn't generate an indexscan when $1 was of the form 'prefix%'.

...

The implication of this is that 8.3 is significantly worse than 8.2
in optimizing unnamed statements in the extended-Query protocol;
a feature that JDBC, at least, relies on.

The fix is simple: add PlannerInfo to eval_const_expressions's
parameter list, as was done for estimate_expression_value.  I am
slightly hesitant to do this in a stable branch, since it would break
any third-party code that might be calling that function.  I doubt there
is currently any production-grade code doing so, but if anyone out there
is actively using those planner hooks we put into 8.3, it's conceivable
this would affect them.

Still, the performance regression here is bad enough that I think there
is little choice.  Comments/objections?


Yeah, please fix this performance regression in the 8.3 branch.  
This would affect most of the JDBC applications out there, I think.



Was the driver ever changed to take advantage of the above strategy?


IIRC, it is used in most cases with the v3 protocol, as long as you  
don't set a prepare-threshold.


Best Regards
Michael Paesold



Re: [JDBC] [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread Tom Lane
Dave Cramer <[EMAIL PROTECTED]> writes:
> Was the driver ever changed to take advantage of the above strategy?

Well, it's automatic as long as you use the unnamed statement.  About
all that might need to be done on the client side is to use unnamed
statements more often in preference to named ones, and I believe that
something like that did get done in JDBC.

regards, tom lane



Re: [HACKERS] build multiple indexes in single table pass?

2008-04-01 Thread Andrew Dunstan



Aidan Van Dyk wrote:

* Andrew Dunstan <[EMAIL PROTECTED]> [080401 08:22]:
  

From the "idle thoughts in the middle of the night" department:

I don't know if this has come up before exactly, but is it possible that 
we could get a performance gain from building multiple indexes from a 
single sequential pass over the base table? If so, that would probably 
give us  a potential performance improvement in pg_restore quite apart 
from the projected improvement to be got from running several steps in 
parallel processes. The grammar might look a bit ugly, but I'm sure we 
could finesse that.



I've not looked at any of the code, but would the "synchronized scans"
heap machinery help the multiple index creations walk the heap together,
basically giving you this for free (as long as you start concurrent
index creation)?


  


Good question. Might it also help in that case to have pg_dump output 
indexes in a given schema sorted by  rather than 
just ?


cheers

andrew



Re: [HACKERS] build multiple indexes in single table pass?

2008-04-01 Thread Toru SHIMOGAKI


Andrew Dunstan wrote:

I don't know if this has come up before exactly, but is it possible that 
we could get a performance gain from building multiple indexes from a 
single sequential pass over the base table?


It is already implemented in pg_bulkload 
(http://pgbulkload.projects.postgresql.org/). Index tuples of multiple indexes 
are spooled during the single sequential pass over the base table, and the 
spooled index tuples are built up after all of the base table is scanned.
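The spooling approach can be sketched roughly in Python (a hypothetical illustration with invented names, not the pg_bulkload implementation):

```python
# Hypothetical sketch: spool (key, tid) pairs for every index during
# ONE sequential pass over the heap, then sort each spool afterwards.
def build_indexes_single_pass(heap_rows, key_funcs):
    spools = [[] for _ in key_funcs]       # one spool per index
    for tid, row in enumerate(heap_rows):  # the single pass over the heap
        for spool, key in zip(spools, key_funcs):
            spool.append((key(row), tid))
    # Build phase: a real B-tree build would bulk-load from the sorted
    # spool; here sorting stands in for the index build.
    return [sorted(spool) for spool in spools]
```

The scan cost is paid once however many indexes are requested, which is the gain being discussed in this thread.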


A proposal was submitted by Itagaki-san to integrate this feature into core.
see http://archives.postgresql.org/pgsql-hackers/2008-02/msg00811.php .

--
Toru SHIMOGAKI<[EMAIL PROTECTED]>
NTT Open Source Software Center




[HACKERS] Scroll cursor oddity...

2008-04-01 Thread Mike Aubury

Does anyone know what the "correct" behaviour for a scroll cursor should be
when you've scrolled past the end?

If you take this SQL for example : 


   create temp table sometab ( a integer);
   insert into sometab values(1);
   insert into sometab values(2);
   insert into sometab values(3);
   begin work;

   declare c1 scroll cursor for select * from sometab;
   fetch next from c1;
   fetch next from c1;
   fetch next from c1;
   fetch next from c1;
   fetch prior from c1;
   fetch prior from c1;
   fetch prior from c1;




The first 4 fetches work as expected and return 1, 2, 3, and the 4th fetch 
returns no rows as it's at the end of the list...

** But ** - when I do the fetch prior, I would have expected it to go back to 
the '2' row, not the '3' row...

i.e. under PostgreSQL it appears we've scrolled *past* the last row and need 
an additional fetch to get back to our last row...



For reference, here's what I get as output:


CREATE TABLE
INSERT 32429 1
INSERT 32430 1
INSERT 32431 1
BEGIN
DECLARE CURSOR
 a
---
 1
(1 row)

 a
---
 2
(1 row)

 a
---
 3
(1 row)

 a
---
(0 rows)

 a
---
 3
(1 row)

 a
---
 2
(1 row)

 a
---
 1
(1 row)






TIA
-- 
Mike Aubury

Aubit Computing Ltd is registered in England and Wales, Number: 3112827
Registered Address : Clayton House,59 Piccadilly,Manchester,M1 2AQ





Re: [HACKERS] New boxes available for QA

2008-04-01 Thread Stephen Frost
Guillaume,

* Guillaume Smet ([EMAIL PROTECTED]) wrote:
> These servers are available 24/7 to PostgreSQL QA and won't be used
> for other purposes.

Awesome.

> Concerning the second point, I wonder if it's not worth it to have a
> very simple thing already reporting results as the development cycle
> for 8.4 has already started (perhaps several pgbench unit tests
> testing various type of queries with a daily tree). Thoughts?

It didn't occur to me before, but, if you've got a decent amount of disk
space and server time..

I'm almost done scripting up everything to load the TIGER/Line
Shapefiles from the US Census into PostgreSQL/PostGIS.  Once it's done
and working I would be happy to provide it to whomever asks, and it
might be an interesting data set to load/query and look at benchmarks
with.  There's a lot of GiST index creation, as well as other indexes
like soundex(), and I'm planning to use partitioning of some sort for
the geocoder.  We could, for example, come up with some set of arbitrary
addresses to geocode and see what the performance of that is.

It's just a thought, and it's a large/"real" data set to play with.

The data set is 22G compressed shapefiles/dbf files.  Based on my
initial numbers I think it'll grow to around 50G loaded into PostgreSQL
(I'll have better numbers later today).  You can get the files from
here: http://ftp2.census.gov/geo/tiger/TIGER2007FE/ Or, if you run into
a problem with that, I can provide a pretty fast site to pull them from
as well (15Mb/s).

Thanks,

Stephen




Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-01 Thread Stuart Brooks



 Please do --- I have a lot of other stuff on my plate.




Please see the attached patch. One change I made is to hold the SHARE lock
on the page while ANALYZE is reading tuples from it. I thought it would
be a right thing to do instead of repeatedly acquiring/releasing the lock.
  
I have applied the patch and have started my test again, it takes a 
little while to fill up so I should have the results sometime tomorrow.


Thanks for the quick response.
Stuart



Re: [HACKERS] build multiple indexes in single table pass?

2008-04-01 Thread Aidan Van Dyk
* Andrew Dunstan <[EMAIL PROTECTED]> [080401 08:22]:
> 
> From the "idle thoughts in the middle of the night" department:
> 
> I don't know if this has come up before exactly, but is it possible that 
> we could get a performance gain from building multiple indexes from a 
> single sequential pass over the base table? If so, that would probably 
> give us  a potential performance improvement in pg_restore quite apart 
> from the projected improvement to be got from running several steps in 
> parallel processes. The grammar might look a bit ugly, but I'm sure we 
> could finesse that.

I've not looked at any of the code, but would the "synchronized scans"
heap machinery help the multiple index creations walk the heap together,
basically giving you this for free (as long as you start concurrent
index creation)?

a.

-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] build multiple indexes in single table pass?

2008-04-01 Thread Pavan Deolasee
On Tue, Apr 1, 2008 at 5:51 PM, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
>
>   From the "idle thoughts in the middle of the night" department:
>
>  I don't know if this has come up before exactly, but is it possible that
>  we could get a performance gain from building multiple indexes from a
>  single sequential pass over the base table?

http://archives.postgresql.org/pgsql-performance/2008-02/msg00236.php

IMHO it should be possible to extend the grammar to add
multiple indexes in one go. But the current index build itself looks
very tightly integrated with the heap scan, so it might be tricky to
separate out the scan and the index-building activity.

Thanks,
Pavan


-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com



[HACKERS] build multiple indexes in single table pass?

2008-04-01 Thread Andrew Dunstan


From the "idle thoughts in the middle of the night" department:

I don't know if this has come up before exactly, but is it possible that 
we could get a performance gain from building multiple indexes from a 
single sequential pass over the base table? If so, that would probably 
give us  a potential performance improvement in pg_restore quite apart 
from the projected improvement to be got from running several steps in 
parallel processes. The grammar might look a bit ugly, but I'm sure we 
could finesse that.


cheers

andrew





Re: [HACKERS] actualized SQL/PSM patch

2008-04-01 Thread Jonah H. Harris
On Tue, Apr 1, 2008 at 6:23 AM, Pavel Stehule <[EMAIL PROTECTED]> wrote:
>  I can't to say so plpgpsm is an dialect of plpgsql. Minimally there
>  are different parser. I am sure so supported functions can be shared,
>  but it's mean really dramatic changes in plpgsql code.  I belive so
>  separated languages will be more maintainable.

I agree.  I think it should be a separate language as well.

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/



Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-04-01 Thread tomas

On Tue, Apr 01, 2008 at 09:35:56AM +0200, Mathias Hasselmann wrote:
> On Saturday, 29.03.2008, 12:25 +, [EMAIL PROTECTED] wrote:
> [...]
> > Sorry for a dumb question, but I couldn't figure that out from your
> > references [1]..[4]: does that mean that the PostgreSQL server would
> > "advertise itself" on the local net? Or what is the purpose of linking-in
> > libavahi into the postmaster?
> 
> Yes, that's the purpose.
> 
> > Surely one wouldn't want this in a data center? 
> 
> Yes, this feature definitely targets small-office use, personal use, DB
> developers [...]
> Still you can tell Avahi to explicitly announce at a certain, non-local
> domain, but this feature is not implemented by the patch. Maybe database
> developers in large network environments could make use of such
> announcements. It would be trivial to add.

Personally, I'd be rather scared than delighted ;-)

> > Is there a possiblity to disable that at run time?
> 
> The feature is disabled by default. As long as you do not specify a
> zeroconf_name in your configuration file, nothing happens. This is the
> same behavior as established by the Bonjour code.

Thanks, good to know.

Isn't there a less intrusive option than linking a lib into each and every
possible server, like a config file listing what is to be announced?

Regards
- -- tomás



Re: [JDBC] [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread Dave Cramer


On 1-Apr-08, at 6:25 AM, Michael Paesold wrote:



On 01.04.2008 at 01:26, Tom Lane wrote:

While testing the changes I was making to Pavel's EXECUTE USING patch
to ensure that parameter values were being provided to the planner,
it became painfully obvious that the planner wasn't actually *doing*
anything with them.  For example

execute 'select count(*) from foo where x like $1' into c using $1;

wouldn't generate an indexscan when $1 was of the form 'prefix%'.

...

The implication of this is that 8.3 is significantly worse than 8.2
in optimizing unnamed statements in the extended-Query protocol;
a feature that JDBC, at least, relies on.

The fix is simple: add PlannerInfo to eval_const_expressions's
parameter list, as was done for estimate_expression_value.  I am
slightly hesitant to do this in a stable branch, since it would break
any third-party code that might be calling that function.  I doubt there
is currently any production-grade code doing so, but if anyone out there
is actively using those planner hooks we put into 8.3, it's conceivable
this would affect them.

Still, the performance regression here is bad enough that I think there
is little choice.  Comments/objections?


Yeah, please fix this performance regression in the 8.3 branch. This  
would affect most of the JDBC applications out there, I think.



Was the driver ever changed to take advantage of the above strategy?


Dave

Best Regards
Michael Paesold






Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

2008-04-01 Thread Michael Paesold


On 01.04.2008 at 01:26, Tom Lane wrote:

While testing the changes I was making to Pavel's EXECUTE USING patch
to ensure that parameter values were being provided to the planner,
it became painfully obvious that the planner wasn't actually *doing*
anything with them.  For example

execute 'select count(*) from foo where x like $1' into c using $1;

wouldn't generate an indexscan when $1 was of the form 'prefix%'.

...

The implication of this is that 8.3 is significantly worse than 8.2
in optimizing unnamed statements in the extended-Query protocol;
a feature that JDBC, at least, relies on.

The fix is simple: add PlannerInfo to eval_const_expressions's
parameter list, as was done for estimate_expression_value.  I am
slightly hesitant to do this in a stable branch, since it would break
any third-party code that might be calling that function.  I doubt there
is currently any production-grade code doing so, but if anyone out there
is actively using those planner hooks we put into 8.3, it's conceivable
this would affect them.

Still, the performance regression here is bad enough that I think there
is little choice.  Comments/objections?


Yeah, please fix this performance regression in the 8.3 branch. This  
would affect most of the JDBC applications out there, I think.


Best Regards
Michael Paesold



Re: [HACKERS] actualized SQL/PSM patch

2008-04-01 Thread Pavel Stehule
On 31/03/2008, Stephen Frost <[EMAIL PROTECTED]> wrote:
> Pavel,
>
>   Honestly, I havn't dug into the real patch all that deeply but I did
>   notice a few minor issues which I've listed out below.  The bigger
>   question I have for this patch, however, is just how close is it to
>   PL/pgSQL?  If the differences are minor and far between would it be
>   more reasonable to just make PL/pgSQL play double-duty and have a flag
>   somewhere to indicate when it should be in 'PL/pgPSM' mode?
>
>   Thanks.

Hello,

thank you for your time. I think plpgsql is too different a language
from plpgpsm: mainly there are different concepts for catching errors,
for cursor declaration and operation, and different statements. My
estimate of the overlap:

gram.y - conformance 10%
pl_exec.c - conf. 40%
pl_func.c - conf. 80% (diff in dump functions)
scan.l - conf. 99%

I can't say that plpgpsm is a dialect of plpgsql; at a minimum the
parsers are different. I am sure the supporting functions could be
shared, but that would mean really dramatic changes in the plpgsql
code. I believe separate languages will be more maintainable.

>
>  #1: INSTALL.plpgpsm starts out saying:
> "Installation of PL/pgSQL"
> I'm guessing you just missed changing it.  Also in there:
> "For installation any PL language you need superuser's rights."
> should probably read:
> For installation of any PL language you need superuser rights.
> Or just:
> To install any PL language you need to be the database superuser.
>
>  #2: pl_comp.c has a similar issue in its comments:
> pl_comp.c as the top says "Compiler part of the PL/pgSQL" ..
> plpgpsm_compile  Make an execution tree for a PL/pgSQL function.
> Should read 'PL/pgPSM' there.
>
>  #3: pl_comp.c uses C++ style comments for something which I'm guessing
> you didn't actually intend to even be in the patch:
> //elog(ERROR, "zatim konec");
> in do_compile().
>
>  #4: Again in pl_comp.c there are C++ style comments, this time for
> variables which can probably just be removed:
> //PLpgPSM_nsitem  *nse;
> //char *cp[1];
>
>  #5: In pl_exec.c, exec_stmt_open, again you have C++ style comments:
> // ToDo: Holdable cursors
>
>  #6: In the "expected.out", for the 'fx()' function, the CONTEXT says:
> CONTEXT:  compile of PL/pgSQL function "fx()" near line 2
> Even though it says "LANGUAGE plpgpsm", which seems rather odd.
>
>  #7: gram.y also has in the comments "Parser for the PL/pgSQL" ..
>
>  #8: plpgpsm_compile_error_callback() passes "PL/pgSQL" to errcontext(),
> probably the cause of #7 and fixing it and regenerating the expected
> output would probably work.
>
>  #9: plerrcodes.h also has "PL/pgSQL error codes" in the comments at the
> top.
>
>  #10: ditto for pl_exec.c "Executor for the PL/pgSQL" ..
>
>  #11: more error-strings being passed with "PL/pgSQL" in it in pl_exec.c:
>  in exec_stmt_prepare() and exec_prepare_plan(), exec_stmt_execute():
>  eg:
>  cannot COPY to/from client in PL/pgSQL
>  cannot begin/end transactions in PL/pgSQL
>  cannot manipulate cursors directly in PL/pgSQL
>
>  #12: Also in the comments for plpgpsm_estate_setup are references to
>  PL/pgSQL.
>
>  #13: pl_funcs.c also says "Misc functions for the PL/pgSQL" ..
>
>  #14: plpgpsqm_dumptree outputs:
>  Execution tree of successfully compiled PL/pgSQL function
>  Should be updated for PL/pgPSM
>
>  #15: Header comment in pl_handler.c also says PL/pgSQL
>
>  #16: Function-definition comment for plpgpsqm_call_handler also says
>  PL/pgSQL
>  ditto for plpgpsm_validator
>
>  #17: Header comment in plpgpsm.h say PL/pgSQL, other comments later as
>  well, such as for the PLpgPSM_plugin struct
>
>  #18: Also for the header comment in scan.l
>

I'll correct it, thank you very much

Pavel
> Enjoy,
>



>
> Stephen
>
>
>



Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-01 Thread Simon Riggs
On Tue, 2008-04-01 at 13:07 +0530, Pavan Deolasee wrote:

> Please see the attached patch. One change I made is to hold the SHARE lock
> on the page while ANALYZE is reading tuples from it. I thought it would
> be a right thing to do instead of repeatedly acquiring/releasing the lock.

ANALYZE is a secondary task and so we shouldn't be speeding it up at the
possible expense of other primary tasks. So I think holding locks for
longer than minimum is not good in this case and I see no reason to make
the change described.

We can speed up ANALYZE by using the background reader to preread the
blocks, assuming bgreader will happen one day.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk




Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-04-01 Thread Mathias Hasselmann

Am Samstag, den 29.03.2008, 12:25 + schrieb [EMAIL PROTECTED]:
> 
> On Sat, Feb 23, 2008 at 01:13:38PM +0100, Mathias Hasselmann wrote:
> 
> [...]
> 
> > Avahi/Bonjour/DNS-SD support[1] is very important, for integrating
> > Postgresql with modern desktop environments like OSX, GNOME, KDE: It's
> > very convenient to choose active DBMS servers in your local network from
> > a list, instead of memorizing "cryptic" connection parameters. 
> 
> [...]
> 
> > People not wanting DNS-SD support for their server can easily control
> > that feature via the "--with-avahi" configure switch.
> 
> Sorry for a dumb question, but I couldn't figure that out from your
> references [1]..[4]: does that mean that the PostgreSQL server would
> "advertise itself" on the local net? Or what is the purpose of linking
> libavahi into the postmaster?

Yes, that's the purpose.

> Surely one wouldn't want this in a data center? 

Yes, this feature definitely targets small-office use, personal use, and DB
developers. I don't know enough about data centers to judge the impact
there, but since Avahi - as used in the patch - announces to the local
network only, the impact should be small.

You can also tell Avahi to announce explicitly in a certain non-local
domain, but the patch does not implement that feature. Maybe database
developers in large network environments could make use of such
announcements; it would be trivial to add.

> Is there a possibility to disable that at run time?

The feature is disabled by default. As long as you do not specify a
zeroconf_name in your configuration file, nothing happens. This is the
same behavior as established by the Bonjour code.
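
For context, a minimal sketch of how that would look in practice (the GUC
name mirrors the existing Bonjour setting and is an assumption about this
patch, not a confirmed interface):

```ini
# postgresql.conf -- hypothetical fragment
# Unset/empty (the default) means no announcement is made at all.
zeroconf_name = 'accounting-db'
```

A client on the same link could then list announced servers with the stock
Avahi tools, e.g. `avahi-browse -t _postgresql._tcp` (assuming the patch
announces under the registered `_postgresql._tcp` DNS-SD service type).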

Ciao,
Mathias
-- 
Mathias Hasselmann <[EMAIL PROTECTED]>
http://www.openismus.com/ - We can get it done.




Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-01 Thread Pavan Deolasee
On Tue, Apr 1, 2008 at 1:22 AM, Tom Lane <[EMAIL PROTECTED]> wrote:

>
>  Please do --- I have a lot of other stuff on my plate.
>

Please see the attached patch. One change I made is to hold the SHARE lock
on the page while ANALYZE is reading tuples from it. I thought it would
be the right thing to do instead of repeatedly acquiring/releasing the lock.

Another thing I noticed while working on this is that VACUUM probably reports
the number of dead tuples incorrectly. We don't count DEAD line pointers in
"tups_vacuumed", which is fine if the line pointer was marked DEAD in the
immediately preceding heap_page_prune(): in that case the DEAD line pointer
is counted in the "ndeleted" count returned by heap_page_prune(). But line
pointers that were already DEAD beforehand are never counted.

For example

postgres=# CREATE TABLE test (a int, b char(500));
CREATE TABLE
postgres=# INSERT INTO test VALUES (generate_series(1,15),'foo');
INSERT 0 15
postgres=# DELETE FROM test;
DELETE 15
postgres=# select count(*) from test;
 count
-------
     0
(1 row)

postgres=# VACUUM VERBOSE test;
INFO:  vacuuming "public.test"
INFO:  "test": removed 0 row versions in 1 pages
INFO:  "test": found 0 removable, 0 nonremovable row versions in 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "test": truncated 1 to 0 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM


So VACUUM reports "zero" dead row versions, which may seem
counter-intuitive, especially in the autovacuum log message (someone
may wonder why autovacuum was triggered on the table at all).

I am thinking we can make heap_page_prune() return only the number of
HOT tuples pruned, and then explicitly count the DEAD line pointers in
tups_vacuumed.


Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Analyze-fix.patch.gz
Description: GNU Zip compressed data
