Re: [HACKERS] "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables

2008-03-03 Thread Heikki Linnakangas

Tom Lane wrote:

> I wrote:
>
> > I think we need some better means of recording whether a lock is on a
> > temp object.  We could certainly add a flag to the LOCALLOCK struct,
> > but it's not clear where a clean place to set it would be.  As a rule
> > we don't yet know when locking a relation whether it's temp or not.
>
> Actually ... why are we using the lock manager to drive this at all?

Good question. It has always seemed a bit strange to me. The assumption
that we always hold the lock on a temp table until the end of the
transaction, while true today, seems weak to me.



> Temp-ness of relations is not really something that it has any interest
> in.  What if we get rid of LockTagIsTemp altogether, and instead protect
> 2PC transactions by having a global flag "transactionUsedTempTable"?
> We could clear that at transaction start, and conditionally set it in
> relation_open, for very little cost.


That certainly seems like the simplest and most robust solution.

There's this corner case where that would behave differently than the 
lock manager approach:


BEGIN;
SAVEPOINT sp;
CREATE TEMP TABLE foo(bar int4);
ROLLBACK TO sp;
PREPARE TRANSACTION 'foo';

The flag would have to be per-subxact to avoid that, though I doubt 
anyone is relying on that behavior.
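For illustration, a minimal sketch of the kind of flag being discussed; the
names (MyXactAccessedTempRel and the helper functions) and their placement are
assumptions, not the actual patch:

#include "postgres.h"
#include "utils/rel.h"
#include "utils/errcodes.h"

/* Backend-local flag: has the current transaction touched a temp relation? */
bool        MyXactAccessedTempRel = false;

/* Would be called from StartTransaction() in xact.c */
static void
AtStart_ResetTempRelFlag(void)
{
    MyXactAccessedTempRel = false;
}

/* Would be called from relation_open() once the Relation is at hand;
 * rd_istemp is the 8.3-era "this is a temp rel" marker in RelationData. */
static void
MarkTempRelAccess(Relation rel)
{
    if (rel->rd_istemp)
        MyXactAccessedTempRel = true;
}

/* Would be called from PrepareTransaction() before the 2PC state is written */
static void
CheckTempRelFor2PC(void)
{
    if (MyXactAccessedTempRel)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("cannot PREPARE a transaction that has operated on temporary tables")));
}

Handling the SAVEPOINT corner case above would then be a matter of saving and
restoring (or deliberately ignoring) the flag at subtransaction boundaries,
which this sketch does not attempt.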


In the future, it would be nice to relax the restriction on using temp
rels, though. A flag doesn't lend itself to that easily, but I'm sure
we'll figure out something if we ever get around to implementing that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Read-ahead and parallelism in redo recovery

2008-03-03 Thread Heikki Linnakangas

Aidan Van Dyk wrote:

> How difficult is it to parse the WAL logs with enough knowledge to know
> what heap page (file/offset) a wal record contains (I haven't looked
> into any wal code)?


Unfortunately there's no common format for that. All the heap-related 
WAL records, insert, update and delete, have a 
RelFileNode+ItemPointerData at the beginning of the WAL payload, but 
update records have another ItemPointerData for the tid of the new tuple 
in addition to that. And all indexam WAL records use a format of their own.


It would be nice to refactor that so that there was a common format to
store the file+block number touched by a WAL record, like we have for
full-page images. That would be useful for all kinds of external tools
that parse WAL files, like the read-ahead restore_command you envisioned.
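As a rough illustration only (this struct does not exist in PostgreSQL; the
name and layout are assumptions), such a common prefix might look like:

#include "postgres.h"
#include "storage/relfilenode.h"
#include "storage/block.h"

/* Hypothetical common prefix for data-modifying WAL records, so an external
 * tool could locate the affected block without knowing each resource
 * manager's private record layout. */
typedef struct xl_block_ref
{
    RelFileNode node;           /* tablespace / database / relation */
    BlockNumber blkno;          /* block touched by the record */
} xl_block_ref;

A read-ahead restore_command could then walk the upcoming segment, collect
these references, and prefetch the corresponding files and blocks before the
startup process asks for them.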


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables

2008-03-03 Thread Alvaro Herrera
Heikki Linnakangas escribió:

> In the future, it would be nice to relax the restriction on using temp
> rels, though. A flag doesn't lend itself to that easily, but I'm sure
> we'll figure out something if we ever get around to implementing that.

I can't recall the rationale for this limitation.  Do we need anything
beyond flushing the table's buffers to disk?  That sounds like an easy
thing to implement.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables

2008-03-03 Thread Heikki Linnakangas

Alvaro Herrera wrote:

> Heikki Linnakangas escribió:
>
> > In the future, it would be nice to relax the restriction on using temp
> > rels, though. A flag doesn't lend itself to that easily, but I'm sure
> > we'll figure out something if we ever get around to implementing that.
>
> I can't recall the rationale for this limitation.  Do we need anything
> beyond flushing the table's buffers to disk?  That sounds like an easy
> thing to implement.


See http://archives.postgresql.org/pgsql-hackers/2005-05/msg01223.php

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] Patch application emails

2008-03-03 Thread Bruce Momjian
Reminder to patch appliers,

If you apply a patch that was discussed on patches or hackers, please
reply to the email indicating you have applied the patch.  I can connect
the commit message to the email discussion, but the patch submitter often
does not read the committers list, so it would be good for them to get
feedback that their patch was applied, and to which branches.

If you are applying your own patch, and there was email discussion about
it, please do the same.

Thanks.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables

2008-03-03 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Actually ... why are we using the lock manager to drive this at all?

> Good question. It has always seemed a bit strange to me. The assumption 
> that we always hold the lock on temp table until end of transaction, 
> while true today, seems weak to me.

Looking back, I think it was driven by the desire to tie the behavior
directly to things that are going to get persisted, such as locks.
From that standpoint your initial patch to attach a temp-check to
relation-drop 2PC entries would be the right kind of design.  However,
given what we now know about the lock situation, I'd feel uncomfortable
with applying that without also fixing LockTagIsTemp, and right now
that's looking like much more complexity and possible performance
penalty than it's worth.

> In the future, it would be nice to relax the restriction on using temp
> rels, though. A flag doesn't lend itself to that easily, but I'm sure
> we'll figure out something if we ever get around to implementing that.

Yeah.  As you already noted, there are several other problems that would
have to be dealt with to support that, so we can just leave this as
another one.

Do you want to write up a flag-based patch, or shall I?

regards, tom lane



[HACKERS] Query Builder in pgAdmin for GSoC 2008

2008-03-03 Thread Robins Tharakan
Hi,

I read about Google Summer of Code recently and, having been involved with
pgAdmin for a few weeks, would like to add the Query Builder feature to
pgAdmin as a part of this programme.

I have been reading GSoC-related emails on various PG lists over the past
year, and believe that keeping a project realistic is as important as it
being ambitious. My time at the moment is therefore spent reading about
wxWidgets and its graphical capabilities, and working out how comfortable I
need to get with them before I can start working on it.

I understand that someone was interested in looking into the query builder
at FOSDEM earlier. Since I am still in the reading stage, I thought this is
probably the right time to ask: is anyone working on a Query Builder for
pgAdmin yet?

Regards,
*Robins Tharakan*


Re: [HACKERS] Query Builder in pgAdmin for GSoC 2008

2008-03-03 Thread Robins Tharakan
Sorry!
As rightly pointed out, guess I should have posted this to pgadmin-hackers
instead.

*Robins Tharakan*

On Mon, Mar 3, 2008 at 11:19 PM, Robins Tharakan <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I read about Google Summer of Code recently and, having been involved with
> pgAdmin for a few weeks, would like to add the Query Builder feature to
> pgAdmin as a part of this programme.
>
> I have been reading GSoC-related emails on various PG lists over the past
> year, and believe that keeping a project realistic is as important as it
> being ambitious. My time at the moment is therefore spent reading about
> wxWidgets and its graphical capabilities, and working out how comfortable I
> need to get with them before I can start working on it.
>
> I understand that someone was interested in looking into the query builder
> at FOSDEM earlier. Since I am still in the reading stage, I thought this is
> probably the right time to ask: is anyone working on a Query Builder for
> pgAdmin yet?
>
> Regards,
> *Robins Tharakan*


Re: [HACKERS] Read-ahead and parallelism in redo recovery

2008-03-03 Thread Bruce Momjian

I have added the following TODO:

* Speed WAL recovery by allowing more than one page to be prefetched

  This involves having a separate process that can be told which pages
  the recovery process will need in the near future.
  http://archives.postgresql.org/pgsql-hackers/2008-02/msg01279.php


---

Heikki Linnakangas wrote:
> Aidan Van Dyk wrote:
> > How difficult is it to parse the WAL logs with enough knowledge to know
> > what heap page (file/offset) a wal record contains (I haven't looked
> > into any wal code)?
> 
> Unfortunately there's no common format for that. All the heap-related 
> WAL records, insert, update and delete, have a 
> RelFileNode+ItemPointerData at the beginning of the WAL payload, but 
> update records have another ItemPointerData for the tid of the new tuple 
> in addition to that. And all indexam WAL records use a format of their own.
> 
> It would be nice to refactor that so that there was a common format to
> store the file+block number touched by a WAL record, like we have for
> full-page images. That would be useful for all kinds of external tools
> that parse WAL files, like the read-ahead restore_command you envisioned.
> 
> -- 
>Heikki Linnakangas
>EnterpriseDB   http://www.enterprisedb.com
> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Read-ahead and parallelism in redo recovery

2008-03-03 Thread Bruce Momjian
Florian G. Pflug wrote:
> Greg Stark wrote:
> > Florian G. Pflug wrote:
> >> The same holds true for index scans, though. Maybe we can find a 
> >> solution that benefits both cases - something along the line of a 
> >> bgreader process
> > I posted a patch to do readahead for bitmap index scans using 
> > posix_fadvise. Experiments showed it works great on raid arrays on 
> > Linux. Solaris will need to use libaio though which I haven't tried 
> > yet.
> Cool! I'd like to try it out - is that patch available in the pg-patches
> archives?
> 
> > Doing it for normal index scans is much much harder. You can 
> > readahead a single page by using the next pointer if it looks like 
> > you'll need it. But I don't see a convenient way to get more than 
> > that.
> I was thinking that after reading a page from the index, the backend
> could post a list of heap pages referenced from that index page to the
> shmem. A background process would repeatedly scan that list, and load
> those pages into the buffer cache.

Agreed.  Lots of databases do the index/heap readahead via threads --- I
think we will probably use a separate read-ahead process that knows more
about all the concurrent reads and the tablespaces involved.
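As a rough sketch of the prefetch-hint side of this (illustrative only; the
function, the fixed 8 KB block size, and the error handling are assumptions,
not code from the posted patch):

#define _XOPEN_SOURCE 600
#include <fcntl.h>

#define BLCKSZ 8192             /* assume the default PostgreSQL block size */

/* Ask the kernel to start reading a block we expect to need soon.
 * Returns 0 on success or an errno value, like posix_fadvise itself. */
static int
prefetch_block(int fd, unsigned int blkno)
{
#ifdef POSIX_FADV_WILLNEED
    return posix_fadvise(fd, (off_t) blkno * BLCKSZ, BLCKSZ,
                         POSIX_FADV_WILLNEED);
#else
    (void) fd;
    (void) blkno;
    return 0;                   /* no-op where posix_fadvise is unavailable */
#endif
}

A bgreader-style process would pull (file, block) pairs posted to shared
memory by backends or by the recovery process and call something like this
for each of them, so that the later synchronous read finds the page already
in the OS cache.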

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] UUID data format 4x-4x-4x-4x-4x-4x-4x-4x

2008-03-03 Thread Bruce Momjian

Added to TODO:

* Allow the UUID type to accept non-standard formats

  http://archives.postgresql.org/pgsql-hackers/2008-02/msg01214.php


---

Dawid Kuroczko wrote:
> Hello.
> 
> I am currently playing with the UUID data type and trying to use it to store
> values provided by a third-party (Hewlett-Packard) application.  The problem
> is they format UUIDs as xxxx-xxxx-xxxx-xxxx-xxxx-xxxx-xxxx-xxxx, so I have to
> replace(text,'-','')::uuid for this kind of data.
> 
> Nooow, the case is quite simple and it might be that there are other
> applications
> formatting UUIDs too liberally.
> 
> I am working on a patch to support this format (yes, it is a simple
> modification).
> 
> And in the meanwhile I would like to ask you what you think about it.
> 
> Cons: Such format is not standard.
> 
> Pros: This will help UUID data type adoption. [1]  While good applications
> format their data well, there are others which don't follow standards.  Also,
> I think it is easier for a human being to enter a UUID as 8 groups of 4
> digits.
> 
> Your thoughts?  Should I submit a patch?
> 
>Regards,
>  Dawid
> 
> [1]: My first thought when I received the error message was "hey! this is
> not a UUID, it is too long/too short!"; only later did I check that they
> just don't format it too well.
> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Snapshot Reuse

2008-03-03 Thread Bruce Momjian

I certainly would be interested to see if it improves performance.

---

Simon Riggs wrote:
> In Read Committed transactions we take snapshots much more frequently
> than transactions begin and commit. It would help scalability if we
> didn't need to re-take a snapshot. That's only helpful if the chance of
> seeing the same snapshot again is relatively high.
> 
> Now that we have virtual transactions we may more frequently find
> ourselves taking identical snapshots.
> 
> If we had a counter that incremented each time the main snapshot altered
> in a meaningful way we could set that atomically. We could then read
> this when we take a snapshot to see if it matches our existing snapshot;
> if so then drop the lock quickly and continue with what we already have.
> 
> I can see some downsides to this as well as potential benefits:
> 
> * we ping the counter across CPUs - yes, we will, but that's probably
> better than pinging the whole procarray
> 
> * this relies upon the rate of change of snapshots - need to do the math
> to see how often this might apply
> 
> Not sure yet myself, but it seems worth recording in case it spurs an
> idea from someone else.
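To make the idea concrete, here is a rough sketch of the check being
described; the shared struct, the names, and the locking details are
illustrative assumptions, not an actual proposal:

#include "postgres.h"
#include "storage/lwlock.h"

/* Hypothetical shared state: bumped whenever the set of running
 * transactions changes in a way that affects snapshots. */
typedef struct SnapshotGenShared
{
    uint64      generation;
} SnapshotGenShared;

static SnapshotGenShared *snapGen;      /* would point into shared memory */
static uint64 cachedGeneration;         /* generation our cached snapshot matches */

/* Cheap check: can we keep using the snapshot we already have? */
static bool
cached_snapshot_still_valid(void)
{
    bool        ok;

    LWLockAcquire(ProcArrayLock, LW_SHARED);
    ok = (snapGen->generation == cachedGeneration);
    LWLockRelease(ProcArrayLock);

    return ok;
}

If the generations match, the backend can drop the lock immediately and keep
using its existing snapshot instead of rescanning the procarray.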
> 
> -- 
>   Simon Riggs
>   2ndQuadrant  http://www.2ndQuadrant.com 
> 
> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] proposal: plpgsql return execute ...

2008-03-03 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Pavel Stehule wrote:
> On 27/02/2008, Merlin Moncure <[EMAIL PROTECTED]> wrote:
> > On Wed, Feb 27, 2008 at 1:58 PM, Neil Conway <[EMAIL PROTECTED]> wrote:
> >  > On Wed, 2008-02-27 at 15:24 +0100, Pavel Stehule wrote:
> >  >  > I think RETURN QUERY is a successful idea. It should be completed with
> >  >  > support of dynamic SQL.
> >  >
> >  >  Yeah, I can see that being useful.
> >  >
> >  >  > RETURN EXECUTE sqlstring [USING];
> >  >
> >  >  What is the USING clause for?
> >
> >
> > parameter binding.
> >
> >  see: http://archives.postgresql.org/pgsql-patches/2006-08/msg00267.php
> >
> >
> >  merlin
> >
> 
> I sent modernized version
> http://archives.postgresql.org/pgsql-patches/2007-10/msg00161.php
> 
> Pavel
> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables

2008-03-03 Thread Heikki Linnakangas

Tom Lane wrote:

> Do you want to write up a flag-based patch, or shall I?


I can do that.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] Google Summer of Code 2008

2008-03-03 Thread Jan Urbański

Hi PostgreSQL!

Although this year's GSoC is just starting, I thought getting in touch a bit
earlier would only be of benefit.

I study Computer Science at the Faculty of Mathematics, Informatics
and Mechanics of Warsaw University. I'm currently in my fourth year of
studies. Having chosen Databases for my degree course, I plan to write my
thesis concentrating at least partially on PostgreSQL. This will (hopefully)
be my first GSoC.

For the past one and a half years I've also been working at a privately held
company, Fiok LLP. The company deals, among other things, in developing custom
Web applications, which all use PostgreSQL as their database solution. During
my time at Fiok I have taken part in creating an accounting system for a large
Polish university, capable of generating financial reports required by the
European Union, a publishing platform for editors working at the Polish
Catholic Press Agency, and a custom-tailored CRM application.

All of these projects use unique PostgreSQL features, like PITR and full-text
search to name a few. You can glimpse the implemented FTS functionality by
looking here:
http://system.ekai.pl/kair/?_tw_DepeszeKlientaTable_0__search_plainfulltext=kalendarz&_tw_DepeszeKlientaTable_0__search_rank_orderby=on&screen=depesze
It's the public part of the publishing platform, which allows subscribed
readers to view published messages. The link takes you to search results for
the word 'kalendarz' (which is Polish for calendar), ordered by rank() and
highlighted by headline() (our client uses 8.2, hence the old function names).

I do my work in Fiok almost exclusively from home, showing up at the office
once every two or three weeks, so working in a distributed environment using
SCM tools is natural to me.

I'm also engaged in an open source project called Kato, as one of the key
developers. It's a small project that started as my company's requirement for a
new Web application framework and ended up being released under the New BSD
License. Of course, its native database engine is PostgreSQL. You can take a
look at the source here:
http://kato.googlecode.com/
or play around with a simple demo here:
http://sahara.fiok.pl/~jurbanski/kato-demo/kato-demo.en.php

Speaking of open source contributions, I also wrote an FTS-related patch for
Postgres that made its way into 8.3:
http://archives.postgresql.org/pgsql-patches/2007-11/msg00081.php

I try to follow -patches, occasionally read -hackers and sometimes make
excursions around the pgsql source, trying to learn more and more of it.

About my programming skills, particularly in C: one piece of code I'd like to
show
you was written for an Operating Systems course. It's a kernel patch
implementing I/O operations throttling on a per-process basis through a /proc
based interface. The code lacks comments, as they were in Polish, but it's just
to assure you I'm able to write some good C:
http://students.mimuw.edu.pl/~wulczer/linux-2.6.17.13-iolimits-ju219721.patch

And now for the SoC. As this year's PostgreSQL Ideas page is not set up yet, I
thought I'd give you the two projects floating through my mind:

1. WAL segment files explorer / mangler

While preparing a presentation about PITR and warm standby in PostgreSQL for my
degree course, I thought it would be nice if one had a command-line tool to
examine the contents of a WAL segment file and determine, for example, what
commands were recorded in it, what transaction IDs they belonged to,
etc. This could allow one, for instance, to replay the WAL sequence up until a
function went haywire and wrecked one's data - without the need to know *when*
the accident happened. It could be useful as an alternative method of logging
operations - since WAL files are written anyway, one could imagine a process
periodically looking through them and reporting (perhaps not all) operations
to some external listener. If, for instance, you were curious which column in a
table is updated most, instead of writing a trigger to log updates to it, you
could use the WAL explorer to find updates to that column and log them over the
network, thus reducing disk I/O.
Being even bolder, I thought about allowing editing of the contents of a WAL
file, so if the proverbial junior DBA drops a crucial table and gets caught the
next morning, you don't have to throw away all the transactions that got
committed overnight. Maybe you could *overwrite* his DROP TABLE with something
neutral and replay the WAL up to its end.

2. Implement better selectivity estimates for FTS.

If I'm not mistaken, the @@ operator still uses the contsel selectivity
function, returning 0.001 * the table's row count as the expected number of
rows left after applying the @@ operator. I have in the past been bitten by
performance problems that I think could be traced back to row count estimates
being horribly wrong (i.e. much too low) for FTS queries asking for a very
popular word. Maybe we could do better than just returning one-thousandth?
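For concreteness, the current behaviour being criticized amounts to something
like the following; this is a sketch in the shape of a restriction-selectivity
function, not the actual contsel source:

#include "postgres.h"
#include "fmgr.h"

/* Sketch of a contsel-style estimator: a hard-wired selectivity, so the
 * planner always expects 0.001 * reltuples matching rows no matter how
 * common the searched lexeme actually is.  A better estimator would consult
 * per-column statistics and the query's tsquery instead. */
Datum tssel_sketch(PG_FUNCTION_ARGS);

Datum
tssel_sketch(PG_FUNCTION_ARGS)
{
    (void) fcinfo;              /* a real estimator would examine its arguments */
    PG_RETURN_FLOAT8(0.001);
}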

I myself am more for the first idea, but both seem good.

Re: [HACKERS] proposal: plpgsql return execute ...

2008-03-03 Thread Pavel Stehule
Hello

I have to write this patch first. I am waiting for the EXECUTE USING
patch and other plpgsql patches to be applied.

Regards
Pavel Stehule

On 03/03/2008, Bruce Momjian <[EMAIL PROTECTED]> wrote:
>
>  Your patch has been added to the PostgreSQL unapplied patches list at:
>
> http://momjian.postgresql.org/cgi-bin/pgpatches
>
>  It will be applied as soon as one of the PostgreSQL committers reviews
>  and approves it.
>
>  ---
>
>
>
>  Pavel Stehule wrote:
>  > On 27/02/2008, Merlin Moncure <[EMAIL PROTECTED]> wrote:
>  > > On Wed, Feb 27, 2008 at 1:58 PM, Neil Conway <[EMAIL PROTECTED]> wrote:
>  > >  > On Wed, 2008-02-27 at 15:24 +0100, Pavel Stehule wrote:
>  > >  >  > I think RETURN QUERY is a successful idea. It should be completed
>  > >  >  > with support of dynamic SQL.
>  > >  >
>  > >  >  Yeah, I can see that being useful.
>  > >  >
>  > >  >  > RETURN EXECUTE sqlstring [USING];
>  > >  >
>  > >  >  What is the USING clause for?
>  > >
>  > >
>  > > parameter binding.
>  > >
>  > >  see: http://archives.postgresql.org/pgsql-patches/2006-08/msg00267.php
>  > >
>  > >
>  > >  merlin
>  > >
>  >
>  > I sent modernized version
>  > http://archives.postgresql.org/pgsql-patches/2007-10/msg00161.php
>  >
>  > Pavel
>  >
>
>
>
>  --
>   Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
>   EnterpriseDB http://postgres.enterprisedb.com
>
>   + If your life is a hard drive, Christ can be your backup. +
>



Re: [HACKERS] A couple of PG schedule reminders

2008-03-03 Thread Merlin Moncure
On Thu, Feb 28, 2008 at 10:23 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> Don't forget that our first "commit fest" for 8.4 development starts
>  Saturday March 1.  If you've got a patch that you'd like reviewed in
>  this fest, it's time to send it in.  (But there's plenty more fests
>  to come, so don't panic about it.)
>
>  Also, 8.3.0 has been out for almost a month, so it's time to start
>  thinking about 8.3.1.  The current plan is to freeze 8.3.1 on Thursday
>  Mar 13 for public announcement Monday Mar 17; though of course this
>  could change in the event of discovery of disastrous bugs.
>
>  A note about the relationship of these events: IMHO commit fest
>  should not affect our handling of maintenance of existing releases.
>  Investigation and patching of bugs that are in the field has always
>  had high priority regardless of where we are in a development/release
>  cycle, and I don't see commit fests as changing that.  But if anyone
>  wants to argue for a different policy, feel free to do so...

I am interested in securing a review/discussion for the libpq
parameter stuff.  See:
(http://momjian.us/mhonarc/patches_hold/thrd6.html) -> PGparam.
http://archives.free.net.ph/message/20080111.184825.26caadc0.en.html

The code is functional, tested, documented, and (unfortunately) not
very well discussed.  It's a fairly big patch with various
implementation compromises so maybe a reviewer guide is in order.  A
new version with some very minor bug fixes is coming soon.

merlin



Re: [HACKERS] proposal: plpgsql return execute ...

2008-03-03 Thread Bruce Momjian
Pavel Stehule wrote:
> Hello
> 
> I have to write this patch first. I am waiting for apply EXECUTE USING
> patch and others plpgsql patches.

OK, removed.  Thanks.

---


> 
> Regards
> Pavel Stehule
> 
> On 03/03/2008, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> >
> >  Your patch has been added to the PostgreSQL unapplied patches list at:
> >
> > http://momjian.postgresql.org/cgi-bin/pgpatches
> >
> >  It will be applied as soon as one of the PostgreSQL committers reviews
> >  and approves it.
> >
> >  ---
> >
> >
> >
> >  Pavel Stehule wrote:
> >  > On 27/02/2008, Merlin Moncure <[EMAIL PROTECTED]> wrote:
> >  > > On Wed, Feb 27, 2008 at 1:58 PM, Neil Conway <[EMAIL PROTECTED]> wrote:
> >  > >  > On Wed, 2008-02-27 at 15:24 +0100, Pavel Stehule wrote:
> >  > >  >  > I think RETURN QUERY is a successful idea. It should be completed
> >  > >  >  > with support of dynamic SQL.
> >  > >  >
> >  > >  >  Yeah, I can see that being useful.
> >  > >  >
> >  > >  >  > RETURN EXECUTE sqlstring [USING];
> >  > >  >
> >  > >  >  What is the USING clause for?
> >  > >
> >  > >
> >  > > parameter binding.
> >  > >
> >  > >  see: http://archives.postgresql.org/pgsql-patches/2006-08/msg00267.php
> >  > >
> >  > >
> >  > >  merlin
> >  > >
> >  >
> >  > I sent modernized version
> >  > http://archives.postgresql.org/pgsql-patches/2007-10/msg00161.php
> >  >
> >  > Pavel
> >  >
> >
> >
> >
> >  --
> >   Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
> >   EnterpriseDB http://postgres.enterprisedb.com
> >
> >   + If your life is a hard drive, Christ can be your backup. +
> >
> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



[HACKERS] HOT and autovacuum

2008-03-03 Thread Alvaro Herrera
Hi,

I'm seeing an 8.3 database whose autovacuum-initiated vacuums are being
cancelled and I am not sure of the cause.  I am wondering if a HOT
cleanup round on a page could try to acquire locks on it that would
conflict with autovacuum (basically anything that conflicts with a
vacuum lock).  This would cause autovacuum to commit hara-kiri.

If the table is frequently updated and HOT is invoked often, this could
be very frequent.  The auto-cancel feature of autovacuum is, IMHO, not
welcome in this case.  Perhaps we could teach autovac to not cancel
itself if the other locker is a HOT page prune?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



[HACKERS] libpq type system 0.9a

2008-03-03 Thread Merlin Moncure
The latest version of libpq type system is available here:
http://www.esilo.com/projects/postgresql/libpq/typesys-0.9a.tar.gz

The following modifications were made:
*) documentation fixes
*) parameter resets are no longer automatic
*) updated to patch vs. REL8_3_STABLE

Merlin Moncure & Andrew Chernow
eSilo



Re: [HACKERS] insert ... delete ... returning ... ?

2008-03-03 Thread Bruce Momjian

Added to TODO:

* Allow INSERT ... DELETE ... RETURNING, namely allow the DELETE ...
  RETURNING to supply values to the INSERT
  http://archives.postgresql.org/pgsql-hackers/2008-02/thrd2.php#00979

---

Jonah H. Harris wrote:
> On Sun, Feb 24, 2008 at 5:44 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> > "Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> >  > Not stupid, it doesn't work :)  This was a limitation of the original
> >  > design based on (IIRC) executor-related issues.
> >
> >  There are definitional issues not only implementation ones; in
> >  particular, in subquery-like cases it's entirely unclear how many times
> >  the DML operation will or should get evaluated.
> 
> Yup, that's what it was.  I think I remember the trigger-level and
> top-level executor-related stuff.  If I'm in that area of the code
> soon, I'll see how much would be involved and if I think I have enough
> time, submit a proposal for it.
> 
> -- 
> Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
> EnterpriseDB Corporation| fax: 732.331.1301
> 499 Thornall Street, 2nd Floor  | [EMAIL PROTECTED]
> Edison, NJ 08837| http://www.enterprisedb.com/
> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] A couple of PG schedule reminders

2008-03-03 Thread Josh Berkus
Tom, Bruce,

Is there a list somewhere of the patches submitted for the first commit 
fest?  I'd like to wiki-ize them so we can all spectate and take bets.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] A couple of PG schedule reminders

2008-03-03 Thread Joshua D. Drake

On Mon, 3 Mar 2008 13:55:51 -0800
Josh Berkus <[EMAIL PROTECTED]> wrote:

> Tom, Bruce,
> 
> Is there a list somewhere of the patches submitted for the first
> commit fest?  I'd like to wiki-ize them so we can all spectate and
> take bets.
> 

http://developer.postgresql.org/index.php/Todo:PatchStatus

?

Or are you looking for something else?

Joshua D. Drake

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director |  PostgreSQL political pundit




[HACKERS] RFD: hexstring(n) data type

2008-03-03 Thread Dawid Kuroczko
Following the discussion on making the UUID data type much more liberal
( http://archives.postgresql.org/pgsql-hackers/2008-02/msg01214.php )
I have decided to try to approach it from a more general perspective.

The current state of code is available at:
http://qnex.net/hexstring-0.1-2008-03-03.tgz


And now for more details:

The idea is to have a data type HEXSTRING(n) which can have an optional
typemod specifying the size of data (in bytes).

Internally the datatype is binary-compatible with bytea, but I/O is done as
hex-encoded strings.  The format is liberal: isspace() and ispunct() characters
are skipped while the digits are read.

I have played with two versions of the hexstringin() function, one which uses
strtoul() and the other which uses "home brew" code.  The latter appears to be
faster, so I stayed with that.  But I would appreciate comments on this from
someone more experienced.
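As a rough illustration only (this is not the actual hexstringin() code; the
function names and error handling are assumptions), the "home brew" inner loop
might look like this:

#include <ctype.h>
#include <stddef.h>

static int
hexval(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Returns the number of bytes written, or -1 on a bad character or an odd
 * number of hex digits.  Whitespace and punctuation are skipped, which is
 * what makes the input format liberal. */
static int
decode_hexstring(const char *in, unsigned char *out, size_t outlen)
{
    size_t      n = 0;
    int         hi = -1;

    for (; *in; in++)
    {
        unsigned char c = (unsigned char) *in;
        int         v;

        if (isspace(c) || ispunct(c))
            continue;           /* liberal input: '-', ':', spaces, ... */
        v = hexval(c);
        if (v < 0 || n >= outlen)
            return -1;
        if (hi < 0)
            hi = v;
        else
        {
            out[n++] = (unsigned char) ((hi << 4) | v);
            hi = -1;
        }
    }
    return (hi < 0) ? (int) n : -1;
}

Compared with going through strtoul() for each pair of digits, this avoids
per-call base and locale handling, which is presumably where the observed
speed difference comes from.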

So, what are the use cases?

CREATE DOMAIN liberal_uuid AS hexstring(16);
CREATE DOMAIN liberal_macaddr AS hexstring(6);

...it allows for creating other standard hex-types, as for example:
CREATE DOMAIN wwn AS hexstring(8); --
http://en.wikipedia.org/wiki/World_Wide_Name

Also, it can be a convenient alternative to the bytea format (I know, the
encode()/decode() pair), especially when you have to format output data as
some fancy hex string.

The code is currently just a bunch of input/output/typemod functions which
appear to work.  I will add casts, operators, etc. -- they will most likely be
nicked from bytea.

What I would also like to add is the ubiquitous to_char(hex, format) function.
For a UUID-compatible format it would be called as:
SELECT to_char(hex, 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx') or
SELECT to_char(hex, '8x-4x-4x-4x-12x'), where x is expanded as a
[0-9a-f] digit and X is expanded as [0-9A-F].
I am not sure what to do about variable-length hexstrings; I am considering
something like to_char(hex, '8X-'), which would produce something like
'XXXXXXXX-XXXXXXXX-XXXXXXXX-' for a 12-byte hexstring (what to do about the
dangling '-'?).

...but the original case against liberal UUID was that it would make
the I/O slower.
My simple test:

postgres=# CREATE TABLE uuids AS SELECT uuid_generate_v4()::text AS u
FROM generate_series(1,1000);
CREATE

postgres=# CREATE TEMP TABLE text ON COMMIT DROP AS SELECT
u::hexstring(16) FROM uuids;
SELECT
Time: 13058.486 ms
postgres=# CREATE TEMP TABLE test ON COMMIT DROP AS SELECT u::uuid FROM uuids;
SELECT
Time: 13536.816 ms

...now hexstring is a varlena type but does not use strtoul.  Perhaps
uuid might be made more liberal too.

What do you think about it?

   Regards,
 Dawid



Re: [HACKERS] A couple of PG schedule reminders

2008-03-03 Thread Bruce Momjian
Josh Berkus wrote:
> Tom, Bruce,
> 
> Is there a list somewhere of the patches submitted for the first commit 
> fest?  I'd like to wiki-ize them so we can all spectate and take bets.

No, I am still going through the stuff from last week and have not
started on the patches_hold emails yet.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] A couple of PG schedule reminders

2008-03-03 Thread Stefan Kaltenbrunner

Joshua D. Drake wrote:

> On Mon, 3 Mar 2008 13:55:51 -0800
> Josh Berkus <[EMAIL PROTECTED]> wrote:
>
> > Tom, Bruce,
> >
> > Is there a list somewhere of the patches submitted for the first
> > commit fest?  I'd like to wiki-ize them so we can all spectate and
> > take bets.
>
> http://developer.postgresql.org/index.php/Todo:PatchStatus

that one is in serious need of some updating which I have not found any
time for lately :-(



Stefan



Re: [HACKERS] A couple of PG schedule reminders

2008-03-03 Thread Josh Berkus
Stefan,

> > http://developer.postgresql.org/index.php/Todo:PatchStatus
>
> that one is in serious need of some updating which I have not found any
> time for lately :-(

Right.  I'll happily update it if Bruce will supply me the raw data.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] Google Summer of Code 2008

2008-03-03 Thread Tom Lane
Jan Urbański <[EMAIL PROTECTED]> writes:
> 2. Implement better selectivity estimates for FTS.

+1 for that one ...

regards, tom lane



Re: [HACKERS] HOT and autovacuum

2008-03-03 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I'm seeing a 8.3 database whose autovacuum-initiated vacuums are being
> cancelled and I am not sure of the cause.  I am wondering if a HOT
> cleanup round on a page could try to acquire locks on it that would
> conflict with autovacuum (basically anything that conflicts with a
> vacuum lock).  This would cause autovacuum to commit hara-kiri.

This is nonsense, no?  A buffer cleanup lock is an LWLock not a lock
manager lock.

regards, tom lane



Re: [HACKERS] A couple of PG schedule reminders

2008-03-03 Thread Andrew Dunstan



Josh Berkus wrote:

> Stefan,
>
> > > http://developer.postgresql.org/index.php/Todo:PatchStatus
> >
> > that one is in serious need of some updating which I have not found any
> > time for lately :-(
>
> Right.  I'll happily update it if Bruce will supply me the raw data.

My time right now is limited, but I can review Heikki's two COPY patches 
unless someone else has claimed them.


cheers

andrew



Re: [HACKERS] "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables

2008-03-03 Thread John Smith
On Mon, Mar 3, 2008 at 8:46 AM, Heikki Linnakangas
<[EMAIL PROTECTED]> wrote:
> Tom Lane wrote:
>  > Do you want to write up a flag-based patch, or shall I?
>
>  I can do that.

BTW, I found an easier way of reproducing this (see attached 2pc.sql).
It might help with debugging or verifying a fix/regression.

Thanks for the attention so far, but I have some questions about the issue:

[1] The data file is reported missing in the second transaction only
if the first transaction was ended using PREPARE TRANSACTION. The
error does not show up if a direct COMMIT is performed (commit.sql)
instead of PREPARE TRANSACTION + COMMIT PREPARED. Why is that so?

[2] From all of the discussion here since my first post, I understand
that there are complications for session-level TEMP tables. But is it
easier to support PREPARE TRANSACTION for transactions that create and
drop their TEMP tables, i.e., so that the tables are not session-level
but just transaction-level?

[3] I am not certain how widespread they might be, but I think there
may be some backward compatibility concerns with the patch you are
proposing. On the one hand, the documentation says, "It is not
currently allowed to PREPARE a transaction that has executed any
operations involving temporary tables or created any cursors WITH
HOLD." But temporary tables that are created ON COMMIT DROP are more
like cursors that do not have WITH HOLD specified. So it does not seem
clear from the documentation that PREPARE TRANSACTION is not
supported, and indeed due to the lack of a check in Postgres today, it
seems as though it is supported. Do you think there is a risk in
breaking applications?

Thanks.
- John


2pc.sql
Description: Binary data


commit.sql
Description: Binary data



Re: [HACKERS] pg_dump additional options for performance

2008-03-03 Thread Bruce Momjian
Decibel! wrote:
> On Feb 26, 2008, at 4:36 PM, Tom Lane wrote:
> > I think a sane way to think about what Simon would like to accomplish
> > is not "turn psql into a parallel job scheduler"
> 
> 
> My $0.02: I often find myself wishing I could perform parallel  
> operations in psql. There was a proposal for that that came up during  
> 8.3 development; whatever happened to it?

The concurrent psql patch was never updated to an acceptable state for
it to be reviewed.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] pg_dump additional options for performance

2008-03-03 Thread Bruce Momjian
Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > So if I understand:
> 
> > * we add switches to pg_dump to dump out separate files with --pre,
> > --post and --data (or other names) [TODO: Simon]
> 
> > * we add switches to pg_restore to load/dump from the single archive
> > file the subsets of --pre, --post, --data [TODO: Magnus]
> 
> Oh dear.  Apparently you do not understand that *these are the same
> switches*.  The way pg_dump is structured, there is only one
> implementation to be done for both cases.

Uh, exactly what switches need to be added?  Just the pg_restore switch
to tell pg_restore how many concurrent sessions to run?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] pg_dump additional options for performance

2008-03-03 Thread Bruce Momjian
Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > I've not been advocating improving pg_restore, which is where the -Fc
> > tricks come in.
> > ...
> > I see you thought I meant pg_restore. I don't thinking extending
> > pg_restore in that way is of sufficiently generic use to make it
> > worthwhile. Extending psql would be worth it, since not all psql scripts
> > come from pg_dump.
> 
> OK, the reason I didn't grasp what you are proposing is that it's insane.
> 
> We can easily, and backwards-compatibly, improve pg_restore to do
> concurrent restores.  Trying to make psql do something like this will
> require a complete rewrite, and there is no prospect that it will work
> for any input that didn't come from (an updated version of) pg_dump
> anyway.  Furthermore you will have to write a whole bunch of new code
> just to duplicate what pg_dump/pg_restore already do, ie store/retrieve
> the TOC and dependency info in a program-readable fashion.
> 
> Since the performance advantages are still somewhat hypothetical,
> I think we should reach for the low-hanging fruit first.  If concurrent
> pg_restore really does prove to be the best thing since sliced bread,
> *then* would be the time to start thinking about whether it's possible
> to do the same thing in less-constrained scenarios.

Added to TODO based on this discussion:

o Allow pg_dump to utilize multiple CPUs and I/O channels by dumping
  multiple objects simultaneously

  The difficulty with this is getting multiple dump processes to
  produce a single dump output file.
  http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php

o Allow pg_restore to utilize multiple CPUs and I/O channels by
  restoring multiple objects simultaneously

  This might require a pg_restore flag to indicate how many
  simultaneous operations should be performed.  Only pg_dump's
  -Fc format has the necessary dependency information.

o To better utilize resources, restore data, primary keys, and
  indexes for a single table before restoring the next table

  Hopefully this will allow the CPU-I/O load to be more uniform
  for simultaneous restores.  The idea is to start data restores
  for several objects, and once the first object is done, to move
  on to its primary keys and indexes.  Over time, simultaneous
  data loads and index builds will be running.

o To better utilize resources, allow pg_restore to check foreign
  keys simultaneously, where possible
o Allow pg_restore to create all indexes of a table
  concurrently, via a single heap scan

  This requires a pg_dump -Fc file because that format contains
  the required dependency information.
  http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php

o Allow pg_restore to load different parts of the COPY data
  simultaneously

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-03-03 Thread Bruce Momjian

Heikki, are there any TODO items here?

---

Heikki Linnakangas wrote:
> Joshua D. Drake wrote:
> > On Sun, 24 Feb 2008 00:43:18 +
> > "Heikki Linnakangas" <[EMAIL PROTECTED]> wrote:
> >  
> >> Incidentally, I've been working on a patch to speed up CopyReadLine.
> >> I was going to run some more tests first, but since we're talking
> >> about it, I guess I should just post the patch. I'll post to
> >> pgsql-patches shortly.
> > 
> > On your post to patches you mentioned only about a 5% improvement.
> > Don't get me wrong, 5% is 5% and I respect it greatly but as far as I
> > can tell we are about 300% behind the curve.
> 
> Yeah. Looking at the profile, the time is spent really all over the 
> place. There's no one clear bottleneck to focus on. I think we could do 
> a few more ~5% improvements, but
> 
> At some point, I think we have to bite the bullet and find a way to use 
> multiple CPUs for a single load. I don't have any good ideas or plans 
> for that, but hopefully someone does.
> 
> > My tests were maxing out at ~22G an hour. On hardware that can do
> > 360G an hour, and that is assuming > 50% overhead between OS, libs,
> > etc., I have no choice but to conclude we have a much, much deeper and
> > fundamental issue going on with COPY. I am inspired by Itagaki Takahiro
> > and his batch update of indexes which should help greatly overall but
> > doesn't help my specific issue.
> 
> Yep, the index build idea is an I/O improvement, not a CPU one.
> 
> > Forgive me for not being a C programmer and Alvaro is not online so I
> > would vet these questions with him first.
> > 
> > I know that copy is in theory a bulk loader but, when performing the
> > readline how many lines are we reading?  Do we read up to 8192? Or do we
> > shove in say 8megs of data before we invoke DoCopy?
> 
> We read 64 KB at a time, and then CopyReadLineText returns one line at a 
> time from that buffer.
> 
> Looking at your profile more, and after the memchr patch, the "raw input 
> side" of copy, consisting of reading the data from disk in 64KB blocks, 
> splitting that into lines, and splitting lines into columns, still takes 
> ~20% of the CPU time. I suspect CopyReadAttributesText is the biggest 
> culprit there.
> 
> You could avoid the ~8% spent in XLogInsert in PostgreSQL 8.3, by 
> creating the table (or truncating it) in the same transaction with the COPY.
> 
> After that, heap_formtuple is high on the list. I wonder if we could do 
> something about that.
> 
> > I am just curious if there is some simple low hanging fruit that is
> > possibly missing.
> 
> I don't see any piece of code that's causing problems. We can shave off 
> a few percents here and there I think, but don't expect a 300% 
> improvement anytime soon. A few ideas I've thought about are:
> 
> - use a specialized version of strtol, for base 10. That won't help on 
> your table, but I've seen strtol consume a significant amount of time on 
> tables with numeric/integer columns.
> 
> - Instead of pallocing and memcpying the text fields, leave a little bit 
> of room between fields in the attribute_buf, and write the varlen header 
> there directly. This might help you since your table has a lot of text 
> fields.
> 
> - Instead of the normal PG function calling conventions, provide 
> specialized fastpath input functions for the most common data types. 
> InputFunctionCall consumed 4.5% of the CPU time in your profile.
> 
> - Use a simpler memory context implementation, that's like a stack with 
> no pfree support, for the per-tuple context.
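As an illustration of the first idea in that list (the specialized base-10
parser), something along these lines is presumably what is meant; the function
name and the simplified overflow handling are assumptions:

#include <stdbool.h>

/* Minimal ASCII-only base-10 parser over [s, end): no locale handling,
 * no errno, and no base detection, which is where a generic strtol spends
 * much of its time.  Overflow checking is intentionally omitted here. */
static bool
parse_int64_base10(const char *s, const char *end, long long *result)
{
    long long   val = 0;
    bool        neg = false;

    if (s < end && (*s == '+' || *s == '-'))
        neg = (*s++ == '-');
    if (s == end)
        return false;           /* no digits at all */

    for (; s < end; s++)
    {
        if (*s < '0' || *s > '9')
            return false;       /* reject anything but plain digits */
        val = val * 10 + (*s - '0');
    }

    *result = neg ? -val : val;
    return true;
}

Since COPY already knows where each attribute's text starts and ends, such a
parser could be handed the exact byte range without locale-aware scanning.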
> 
> -- 
>Heikki Linnakangas
>EnterpriseDB   http://www.enterprisedb.com
> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] pg_dump additional options for performance

2008-03-03 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Added to TODO based on this discussion:

> o To better utilize resources, restore data, primary keys, and
>   indexes for a single table before restoring the next table

That idea seems quite misguided, and certainly was not anywhere in the
prior discussion.  If you add an ordering constraint as strong as that,
you'll mostly just eliminate chances for parallelism.  Also, the
following para describes something completely different:

>   Hopefully this will allow the CPU-I/O load to be more uniform
>   for simultaneous restores.  The idea is to start data restores
>   for several objects, and once the first object is done, to move
>   on to its primary keys and indexes.  Over time, simultaneous
>   data loads and index builds will be running.

regards, tom lane



Re: [HACKERS] pg_dump additional options for performance

2008-03-03 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Added to TODO based on this discussion:
> 
> > o To better utilize resources, restore data, primary keys, and
> >   indexes for a single table before restoring the next table
> 
> That idea seems quite misguided, and certainly was not anywhere in the
> prior discussion.  If you add an ordering constraint as strong as that,
> you'll mostly just eliminate chances for parallelism.

The idea comes from this email, which no one objected to:

http://search.postgresql.org/search?m=1&q=pg_dump+create+copy&l=&d=&s=

You could avoid the ~8% spent in XLogInsert in PostgreSQL 8.3, by
creating the table (or truncating it) in the same transaction with the
COPY. 

In the same transaction? Oh that's interesting. So that might be a TODO
right there. Change pg_dump so it does:

create,copy,create,copy,index

Instead of:

create,create,copy,copy,index


> Also, the
> following para describes something completely different:
> 
> >   Hopefully this will allow the CPU-I/O load to be more uniform
> >   for simultaneous restores.  The idea is to start data restores
> >   for several objects, and once the first object is done, to move
> >   on to its primary keys and indexes.  Over time, simultaneous
> >   data loads and index builds will be running.

OK, let me remove the entire item and we can always re-add it if someone
want to make a case for it.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2008-03-03 Thread Brendan Jurd
A quick recap: I submitted a patch for empty ARRAY[] syntax back in
November, and as far as I can see it never made it to the patches
list.  Gregory suggested a different way of approaching the problem
(quoted below), but nobody commented further about how it might be
made to work.

I'd like to RFC again on Gregory's idea, and if that doesn't bear any
fruit I'd like to submit the patch as-is for review.

Regards,
BJ

On 01/12/2007, Brendan Jurd <[EMAIL PROTECTED]> wrote:
> On Nov 30, 2007 9:09 PM, Gregory Stark <[EMAIL PROTECTED]> wrote:
>  > I'm sorry to suggest anything at this point, but... would it be less 
> invasive
>  > if instead of requiring the immediate cast you created a special case in 
> the
>  > array code to allow a placeholder object for "empty array of unknown type".
>  > The only operation which would be allowed on it would be to cast it to some
>  > specific array type.
>  >
>  > That way things like
>  >
>  > UPDATE foo SET col = array[];
>  > INSERT INTO foo (col) VALUES (array[]);
>  >
>  > could be allowed if they could be contrived to introduce an assignment 
> cast.
>
>  Not sure it would be less invasive, but I do like the outcome of being
>  able to create an empty array pending assignment.  In addition to your
>  examples, it might also make it possible to do things like this in
>  plpgsql
>
>  DECLARE
>   a text[] := array[];
>
>  Whereas my patch requires you to write
>
>   a text[] := array[]::text[];
>
>  ... which seems pretty stupid.
>
...
>  Any suggestions about how you would enforce the "only allow casts to
>  array types" restriction on the empty array?
>



Re: [HACKERS] pg_dump additional options for performance

2008-03-03 Thread Joshua D. Drake

On Mon, 3 Mar 2008 20:33:08 -0500 (EST)
Bruce Momjian <[EMAIL PROTECTED]> wrote:

> Added to TODO based on this discussion:
> 
> o Allow pg_dump to utilize multiple CPUs and I/O channels by
> dumping multiple objects simultaneously
> 
>   The difficulty with this is getting multiple dump processes
> to produce a single dump output file.
>   http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php

Isn't part of this problem also a consistent snapshot?

Joshua D. Drake


--
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director |  PostgreSQL political pundit




[HACKERS] Replacing the corrupt "global" folder with older one

2008-03-03 Thread chris.jurado
Sorry for sending this directly to the hackers mailing list, but I think it
did not belong on any other list, as it involves internals about the files in
the data directory.


PostgreSQL version: 8.1.2
Operating system:   Windows XP/2003

The PostgreSQL service is not starting anymore. When I manually start it, it 
said something like it started but ended immediately because it had no work to 
do. I encountered this error before but a simple re-installation fixed it. This 
time, it didn't.

I've tried running the service using a backup data folder which was from 2
days ago, and it runs. When I switch back to the current one, I get the error
above, so I figured the current data must be corrupt. After examining the data
folder, I found that the global folder is now not a folder but a file that is
8KB in size (I found this out by trying out the reset_xlog command and it said
it could no longer find the global/xxx file).  It could be that the hard disk
has a problem; I'm going to replace it soon.

I searched the documentation about what the global folder contains and found 
out that these contain cluster-wide tables like the list of databases, etc.

I don't want to lose the latest transactions by restoring a backup. Now the
question is this: is it OK to just copy the global folder from my backup from
2 days ago and replace the one in my current data folder? I'm very sure that
no DDL statements were executed and no new databases or other objects were
created since my last backup; only DML statements were executed on the users'
databases. Will I lose data if I do this?





