Re: [HACKERS] tsearch in core patch, for inclusion
Bruce Momjian <[EMAIL PROTECTED]> writes: > Oleg Bartunov wrote: >> It's not so big addition to the gram.y, see a list of commands >> http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. > I looked at the diff file and the major change in gram.y is the creation > of a new object type FULLTEXT, You mean four different object types. I'm not totally clear on bison's scaling behavior relative to the number of productions, but I think there's no question that this patch will impose a measurable distributed penalty on every single query issued to Postgres by any application, whether it's heard of tsearch or not. The percentage overhead would be a lot lower if the patch were introducing a similar number of entries into pg_proc. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] New feature request: FlashBack Query
RPK wrote: > Andrew, > >>> Demanding unlimited undo at some time that is arbitrarily distant in the >>> future strikes me as wholly unreasonable. > > I did not mean asking for undo from a life-time log. Since FlashBack > Technology is already there, I just mean that the world's most advanced database > (PostgreSQL, as they say) must have an optimized way of undoing at > least a week's changes. A week's log is enough, and PostgreSQL can keep on > removing old logs automatically. > > Secondly, it must be left to the user to decide the number of days of > archive he wants to store. Again, up to a week max. You might look at storing deltas or similar (perhaps with a check table) if you need to change data a week back. Then you can just find the row representing the problematic change and delete it. If you really want to track what happens, do deltas and then, instead of deleting them, put a reversing delta in, keeping your entire audit trail. You can put materialized views on top of this if you need performance. - August
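August's delta/audit-trail suggestion can be sketched as a trigger-based history table. This is an illustrative sketch only (the `accounts` table and all names here are hypothetical, not from the thread), showing a plain audit pattern rather than a built-in flashback facility:

```sql
-- Sketch, assuming a table "accounts" exists. Each change stores the old and
-- new row images; purging history older than the user's chosen window
-- (a week here) is a single DELETE.
CREATE TABLE accounts_history (
    hist_id    bigserial PRIMARY KEY,
    changed_at timestamptz NOT NULL DEFAULT now(),
    op         text NOT NULL,   -- 'INSERT', 'UPDATE', or 'DELETE'
    old_row    accounts,        -- row image before the change; NULL for INSERT
    new_row    accounts         -- row image after the change; NULL for DELETE
);

CREATE OR REPLACE FUNCTION accounts_audit() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO accounts_history (op, new_row) VALUES (TG_OP, NEW);
    ELSIF TG_OP = 'UPDATE' THEN
        INSERT INTO accounts_history (op, old_row, new_row) VALUES (TG_OP, OLD, NEW);
    ELSE
        INSERT INTO accounts_history (op, old_row) VALUES (TG_OP, OLD);
    END IF;
    RETURN NULL;   -- AFTER trigger: the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_audit AFTER INSERT OR UPDATE OR DELETE
    ON accounts FOR EACH ROW EXECUTE PROCEDURE accounts_audit();

-- Drop history beyond the retention window:
DELETE FROM accounts_history WHERE changed_at < now() - interval '7 days';
```

To "reverse" a bad change while keeping the full trail, as August describes, one would insert a new history row recording the compensating update rather than deleting the original history row.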
Re: [HACKERS] tsearch in core patch, for inclusion
It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. SQL commands make FTS syntax clear and follow tradition to manage system objects. From the user's side, I'd be very unhappy to configure FTS, which can be very complex, using functions. All we want is to provide users clear syntax. This is like the third time we have been around this problem. The syntax is clear and reasonable, IMO. Can we stop arguing about it and just include it? If there are specific issues beyond syntax, that is one thing, but at this point it seems we are arguing for the sake of arguing. Joshua D. Drake
Re: [HACKERS] timestamp subtraction (was Re: [SQL] formatting intervals with to_char)
One problem with removing justify_hours() is that this is going to return '24:00:00', rather than '1 day':

test=> select '2004-01-02 00:00:00'::timestamptz - '2004-01-01 00:00:00'::timestamptz;
 ?column?
----------
 24:00:00
(1 row)

--- Jim Nasby wrote: > On Oct 5, 2006, at 11:50 AM, Tom Lane wrote: > > regression=# select ('2006-09-15 23:59:00'::timestamp - '2006-09-01 > > 09:30:41'::timestamp); > > ?column? > > -- > > 14 days 14:28:19 > > (1 row) > > > > should be reporting '350:28:19' instead. > > > > This is a hack that was done to minimize the changes in the regression > > test expected outputs when we changed type interval from months/ > > seconds > > to months/days/seconds. But I wonder whether it wasn't a dumb idea. > > It is certainly inconsistent, as noted in the code comments. > > > > I'm tempted to propose that we remove the justify_hours call, and tell > > anyone who really wants the old results to apply justify_hours() to > > the > > subtraction result for themselves. Not sure what the fallout would > > be, > > though. > > I suspect there's applications out there that are relying on that > being nicely formatted for display purposes. > > I agree it should be removed, but we might need a form of backwards > compatibility for a version or two... > -- > Jim Nasby [EMAIL PROTECTED] > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
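For anyone following the thread, Tom's suggested workaround is just to wrap the subtraction result explicitly; the two behaviors compare like this (justify_hours() is the existing function of that name):

```sql
-- Without the implicit justify_hours() call, subtraction would report raw hours:
SELECT '2004-01-02 00:00:00'::timestamptz
     - '2004-01-01 00:00:00'::timestamptz;    -- would yield '24:00:00'

-- Applying justify_hours() explicitly restores the day-normalized form:
SELECT justify_hours(interval '24:00:00');    -- 1 day
SELECT justify_hours(interval '350:28:19');   -- 14 days 14:28:19
```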
Re: [HACKERS] Column storage positions
On Tuesday 20 February 2007 16:07, Phil Currier wrote: > Another problem relates to upgrades. With tools like pg_migrator now > on pgfoundry, people will eventually expect quick upgrades that don't > require rewriting each table's data. Storage positions would cause a > problem for every version X -> version Y upgrade with Y >= 8.3, even > when X is also >= 8.3, because a version X table could always have > been altered without a rewrite into a structure different from what > Y's CREATE TABLE will choose. If you are using pg_migrator you're not going to be moving the datafiles on disk anyway, so pg_migrator's behavior shouldn't change terribly. If you're doing a pg_dump based upgrade, presumably pg_dump could write its CREATE statements with the columns in attstorpos order and set attnum = attstorpos, preserving the physical layout from the previous install. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] Column storage positions
Just my 2 cents on the proposed idea: I want to demonstrate that it is very relevant for performance. I recently did a migration from PG 8.1 to PG 8.2. During that time I was dumping the 2TB database with several very wide tables (having ~ 200 columns). And I saw that on my pretty powerful server with 8 GB RAM, an Itanium2 processor, and a large RAID which can do I/O at 100 MB/sec, the performance of pg_dump was CPU limited, and the read speed of the tables was 1-1.5 MB/sec (leading to a 2 week dumping time). I was very surprised by these times, and profiled postgres to check the reason. Here is the top of gprof:

  %   cumulative   self              self     total
 time    seconds   seconds    calls   s/call   s/call  name
 60.72     13.52     13.52  6769826     0.00     0.00  nocachegetattr
 10.58     15.88      2.36  9035566     0.00     0.00  CopyAttributeOutText
  7.22     17.49      1.61 65009457     0.00     0.00  CopySendData
  6.34     18.90      1.41        1     1.41    22.21  CopyTo

So the main slow-down of the process was all this code recomputing the boundaries of the columns. I checked that by removing one tiny varchar column and COALESCEing all NULLs, and after that the performance of pg_dump increased by more than a factor of 2! I should have reported that experience earlier... but I hope that my observations can be useful in the context of Phil's idea. regards, Sergey *** Sergey E. Koposov Max Planck Institute for Astronomy/Cambridge Institute for Astronomy/Sternberg Astronomical Institute Tel: +49-6221-528-349 Web: http://lnfm1.sai.msu.ru/~math E-mail: [EMAIL PROTECTED]
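Sergey's effect is easy to reproduce in miniature: any variable-width (or nullable) column forces offset recomputation for every column to its right, so access cost grows with column position. The following setup is illustrative only (table and column names are made up, not from the original report):

```sql
-- A leading varchar means no later column has a cacheable offset, so reading
-- c10 must walk past all preceding columns for every row (cf. nocachegetattr
-- at the top of the profile above).
CREATE TABLE wide AS
SELECT 'filler'::varchar(10) AS c0,
       g AS c1, g AS c2, g AS c3, g AS c4, g AS c5,
       g AS c6, g AS c7, g AS c8, g AS c9, g AS c10
FROM generate_series(1, 1000000) g;

\timing on
SELECT max(c1)  FROM wide;   -- cheap: c1 sits just after the varchar
SELECT max(c10) FROM wide;   -- slower: ten offsets recomputed per row
```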
[HACKERS] --enable-xml instead of --with-libxml?
Now we have --with-libxml (and the USE_LIBXML constant for #ifdef-s), which is not quite right: XML support is something more general than "using the libxml2 library". E.g., some SQL/XML publishing functions (such as XMLPI) do not deal with libxml2 at all. Also, in the future more "non-libxml" functionality could be added to "XML support" (well, Peter's recent SQL-to-XML mapping functions prove it). I think it'd be better to rename the configure option to --enable-xml and USE_LIBXML to ENABLE_XML. I'll do it if there are no objections. -- Best regards, Nikolay
Re: [HACKERS] PrivateRefCount (for 8.3)
Added to TODO: * Consider decreasing the amount of memory used by PrivateRefCount http://archives.postgresql.org/pgsql-hackers/2006-11/msg00797.php http://archives.postgresql.org/pgsql-hackers/2007-01/msg00752.php --- Simon Riggs wrote: > On Mon, 2006-11-27 at 14:42 -0500, Bruce Momjian wrote: > > Simon Riggs wrote: > > > int8 still seems like overkill. When will the ref counts go above 2 on a > > > regular basis? Surely refcount=2 is just chance at the best of times. > > > > > > Refcount -> 2 bits per value, plus a simple overflow list? That would > > > allow 0,1,2 ref counts plus 3 means look in hashtable to find real > > > refcount. > > > > At two bits, would we run into contention for the byte by multiple > > backends? > > No contention, it's a private per-backend data structure. That's why we > want to reduce the size of it so badly. > > -- > Simon Riggs > EnterpriseDB http://www.enterprisedb.com -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] fixing Makefile.shlib for solaris/gcc with -m64 flag
OK, I have made Solaris gcc the same as Linux in Makefile.shlib, patch attached. I am not backpatching this. We will get this tested for 8.3. --- Jignesh K. Shah wrote: > I dont think we solved this.. But I think the way to put -m64 should be same > as in Linux and Solaris > and not different. > > Thanks. > Regards, > Jignesh > > > Tom Lane wrote: > > Peter Eisentraut <[EMAIL PROTECTED]> writes: > >> Am Mittwoch, 17. Januar 2007 17:12 schrieb Tom Lane: > >>> "Jignesh K. Shah" <[EMAIL PROTECTED]> writes: > simple if I use -m64 for 64 bit then all end binaries are generated > 64-bit and the shared libraries are generated 32-bit and the compilation > fails (ONLY ON SOLARIS) since that particular line is only for the > condition Solaris AND gcc. > > If I use the COMPILER which is CC + CFLAGS it passes -m64 properly to it > and generates shared libraries 64-bit and the compile continues.. > >>> Hmm ... I see we're doing it that way already for some other platforms, > >>> but I can't help thinking it's a kluge. Wouldn't the correct answer be > >>> that -m64 needs to be in LDFLAGS? > > > >> The correct answer may be to put -m64 into CC. > > > > Did we conclude that that was a satisfactory solution, or is this still > > a live patch proposal? > > > > If -m64 in CC is the right solution, it should probably be mentioned in > > FAQ_Solaris. > > > > regards, tom lane > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faq -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/Makefile.shlib === RCS file: /cvsroot/pgsql/src/Makefile.shlib,v retrieving revision 1.108 diff -c -c -r1.108 Makefile.shlib *** src/Makefile.shlib 11 Feb 2007 19:31:45 - 1.108 --- src/Makefile.shlib 20 Feb 2007 22:45:06 - *** *** 191,197 ifeq ($(PORTNAME), solaris) ifeq ($(GCC), yes) ! 
LINK.shared = $(CC) -shared else # CFLAGS added for X86_64 LINK.shared = $(CC) -G $(CFLAGS) --- 191,197 ifeq ($(PORTNAME), solaris) ifeq ($(GCC), yes) ! LINK.shared = $(COMPILER) -shared # $(COMPILER) needed for -m64 else # CFLAGS added for X86_64 LINK.shared = $(CC) -G $(CFLAGS) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [pgsql-patches] pg_get_domaindef
Bruce Momjian wrote: I always felt it was better for us to have server functions that return schema-specific data rather than require every application to define its own functions. I realize they are duplicated in pg_dump, but even if we made an external library that pg_dump could share with applications, would it only be available to C applications? That seems quite limiting. I don't think so. I imagine that the maintainers of DBD::Pg and phppgadmin, for example, would be very likely to expose them. And I can certainly imagine using/exposing them in some psql slash commands. cheers andrew
Re: [HACKERS] DROP FUNCTION failure: cache lookup failed for relation X
Added to TODO: * Increase locking when DROPing objects so dependent objects cannot get dropped while the DROP operation is happening http://archives.postgresql.org/pgsql-hackers/2007-01/msg00937.php --- Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Uh, where are we on this? > > Still in the think-about-it mode, personally ... my proposed fix is > certainly much too invasive to consider back-patching, so unless someone > comes up with a way-simpler idea, it's 8.3 material at best ... > > regards, tom lane -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] tsearch in core patch, for inclusion
Oleg Bartunov wrote: > On Tue, 20 Feb 2007, Alvaro Herrera wrote: > > > Bruce Momjian wrote: > >> > >> FYI, I added this to the patches queue because I think we decided > >> full-text indexing should be in the core. If I am wrong, please let me > >> know. > > > > One of the objections I remember to this particular implementation was > > that configuration should be done using functions rather than new syntax > > in gram.y. This seems a good idea because it avoids bloating the > > grammar, while still allowing dependency tracking, pg_dump support, > > syscache support etc. > > It's not so big addition to the gram.y, see a list of commands > http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. > SQL commands make FTS syntax clear and follow tradition to manage > system objects. From the user's side, I'd be very unhappy to configure > FTS, which can be very complex, using functions. All we want is to > provide users clear syntax. I looked at the diff file and the major change in gram.y is the creation of a new object type FULLTEXT, so you can CREATE, ALTER and DROP FULLTEXT. I don't know fulltext administration well enough, so if Oleg says a function API would be too complex, I am OK with his new parser syntax. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
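For historical context on what this DDL-based approach looks like in practice: the patch's FULLTEXT object type was renamed before release and shipped in 8.3 as TEXT SEARCH objects, e.g.:

```sql
-- The dedicated-syntax style under discussion, as it ultimately appeared in
-- 8.3 (CREATE/ALTER/DROP over configurations, parsers, dictionaries,
-- and templates):
CREATE TEXT SEARCH CONFIGURATION public.my_english ( COPY = pg_catalog.english );

ALTER TEXT SEARCH CONFIGURATION public.my_english
    ALTER MAPPING FOR asciiword WITH english_stem;

DROP TEXT SEARCH CONFIGURATION public.my_english;
```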
Re: [HACKERS] tsearch in core patch, for inclusion
On Tue, 20 Feb 2007, Alvaro Herrera wrote: Bruce Momjian wrote: FYI, I added this to the patches queue because I think we decided full-text indexing should be in the core. If I am wrong, please let me know. One of the objections I remember to this particular implementation was that configuration should be done using functions rather than new syntax in gram.y. This seems a good idea because it avoids bloating the grammar, while still allowing dependency tracking, pg_dump support, syscache support etc. It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. SQL commands make FTS syntax clear and follow tradition to manage system objects. From the user's side, I'd be very unhappy to configure FTS, which can be very complex, using functions. All we want is to provide users clear syntax. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] msvc failure in largeobject regression test
Was this problem addressed? --- Magnus Hagander wrote: > On Tue, Jan 23, 2007 at 11:39:23AM -0800, Jeremy Drake wrote: > > On Tue, 23 Jan 2007, Magnus Hagander wrote: > > > > > On Tue, Jan 23, 2007 at 09:31:40AM -0500, Andrew Dunstan wrote: > > > > Magnus Hagander wrote: > > > > >Hi! > > > > > > > > > >I get failures for the largeobject regression tests on my vc++ build. I > > > > >don't think this has ever worked, given that those tests are fairly > > > > >new. > > > > >Any quick ideas on what's wrong before I dig deeper? > > > > > > > > > > > > > > [snip] > > > > > > > > I wonder if this is a line-end issue? Assuming you are working from CVS, > > > > does your client turn \n into \r\n ? I see that other windows boxes are > > > > happily passing this test on the buildfarm, and of course the mingw cvs > > > > doesn't adjust line endings. > > > > > > Bingo! > > > > > > That's it. I copied the file in binary mode from a linux box and now it > > > passes. > > > > I thought about that when I wrote it, and thus tried it under mingw and > > cygwin without issue ;) I don't think the regression tests were in a > > position of running on the msvc build at the time... My thought for what > > to do if this did run into a problem would be an alternate output file > > that is also acceptable (I don't know what they're called but other tests > > have them IIRC). > > Either that, or we require a checkout using Unix style linefeeds. I've > confirmed that removing the file and checking it back out with "cvs --lf > update tenk.data" works - tests pass fine. > > Yet another option might be to flag that file as binary in cvs, in which > case I think cvsnt shouldn't go mess with it.
> > //Magnus -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [pgsql-patches] pg_get_domaindef
I always felt it was better for us to have server functions that return schema-specific data rather than require every application to define its own functions. I realize they are duplicated in pg_dump, but even if we made an external library that pg_dump could share with applications, would it only be available to C applications? That seems quite limiting. Of course, if people don't need these functions, then we shouldn't have them. Seems we have to decide on this one so we can update the TODO or apply the patch. --- Tom Lane wrote: > Andrew Dunstan <[EMAIL PROTECTED]> writes: > > FAST PostgreSQL wrote: > >> Please find attached the patch with modifications > > > are you proposing to implement the other functions in this TODO item > > (pg_get_acldef(), pg_get_typedefault(), pg_get_attrdef(), > > pg_get_tabledef(), pg_get_functiondef() ) ? > > I haven't entirely understood the use case for any of these. It's not > pg_dump, for a number of reasons: one being that pg_dump still has to > support older backend versions, and another being that every time we > let backend SnapshotNow functions get involved, we take another hit to > pg_dump's claim to produce a consistent MVCC snapshot. > > But my real objection is: do we really want to support duplicative code > in both pg_dump and the backend? Updating pg_dump is already a major > PITA whenever one adds a new feature; doubling that work isn't > attractive. (And it'd be double, not just a copy-and-paste, because of > the large difference in the operating environment.) So I want to hear a > seriously convincing use-case that will justify the maintenance load we > are setting up for ourselves. "Somebody might want this" is not > adequate. > > Perhaps a better area of work would be the often-proposed refactoring of > pg_dump into a library and driver program, wherein the library could > expose individual functions such as "fetch the SQL definition of this > object".
Unfortunately, that'll be a huge project with no payoff until > the end... > > regards, tom lane -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] tsearch in core patch, for inclusion
Bruce Momjian wrote: > > FYI, I added this to the patches queue because I think we decided > full-text indexing should be in the core. If I am wrong, please let me > know. One of the objections I remember to this particular implementation was that configuration should be done using functions rather than new syntax in gram.y. This seems a good idea because it avoids bloating the grammar, while still allowing dependency tracking, pg_dump support, syscache support etc. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] No ~ operator for box, point
Added to TODO: * Add missing operators for geometric data types Some geometric types do not have the full suite of geometric operators, e.g. box @> point --- Jim Nasby wrote: > * Add missing operators for geometric data types and operators > > There are geometric data types that do not have the full suite > of geometric operators > defined; for example, box @> point does not exist. > > On Jan 26, 2007, at 9:32 PM, Bruce Momjian wrote: > > > > > Can I get a TODO on this? > > > > -- > > - > > > > Jim Nasby wrote: > >> On Jan 25, 2007, at 6:26 PM, Tom Lane wrote: > >>> Martijn van Oosterhout writes: > On Thu, Jan 25, 2007 at 01:59:33PM -0500, Merlin Moncure wrote: > > On 1/25/07, Jim C. Nasby <[EMAIL PROTECTED]> wrote: > >> decibel=# select box '((0,0),(2,2))' ~ point '(1,1)'; > >> ERROR: operator does not exist: box ~ point > > > > I don't see a reason, although you can do it with polygon and not > > box. > >>> > >>> Seems like an old oversight. > >> > >> Ok. If I ever get some time I'll submit a patch to bring everything > >> in-line (there's other missing operators as well). > >> > > Also, I can't find the ~ operator defined for polygon in the > > documentation, am I missing something? > >>> > >>> ~ is deprecated, "contains" is preferentially spelled @> now. > >> > >> Ok, I'll keep that in mind. > >> -- > >> Jim Nasby[EMAIL PROTECTED] > >> EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) > >> > >> > >> > >> ---(end of > >> broadcast)--- > >> TIP 7: You can help support the PostgreSQL project by donating at > >> > >> http://www.postgresql.org/about/donate > > > > -- > > Bruce Momjian [EMAIL PROTECTED] > > EnterpriseDBhttp://www.enterprisedb.com > > > > + If your life is a hard drive, Christ can be your backup. 
+ > > > > -- > Jim Nasby [EMAIL PROTECTED] > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
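The workaround discussed upthread, for reference: since box ~ point (and its preferred @> spelling) was missing, converting the box to a polygon works, because polygon has the containment operator:

```sql
-- box ~ point did not exist at the time:
--   SELECT box '((0,0),(2,2))' ~ point '(1,1)';    -- ERROR: operator does not exist
-- Casting the box to a polygon sidesteps the gap, since polygon ~ point exists
-- (~ being the deprecated spelling of "contains"):
SELECT polygon(box '((0,0),(2,2))') ~ point '(1,1)';   -- true
```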
Re: [HACKERS] tsearch in core patch, for inclusion
FYI, I added this to the patches queue because I think we decided full-text indexing should be in the core. If I am wrong, please let me know. --- Teodor Sigaev wrote: > We (Oleg and me) are glad to present tsearch in core of pgsql patch. In > basic, > layout, functions, methods, types etc are the same as in current tsearch2 > with a > lot of improvements: > > - pg_ts_* tables now are in pg_catalog > - parsers, dictionaries, configurations now have owner and namespace > similar to > other pgsql's objects like tables, operator classes etc > - current tsearch configuration is managed with a help of GUC variable > tsearch_conf_name. > - choosing of tsearch cfg by locale may be done for each schema separately > - managing of tsearch configuration with a help of SQL commands, not with > insert/update/delete statements. This allows to drive dependencies, > correct dumping and dropping. > - psql support with a help of \dF* commands > - add all available Snowball stemmers and corresponding configuration > - correct memory freeing by any dictionary > > Work is sponsored by EnterpriseDB's PostgreSQL Development Fund. > > patch: http://www.sigaev.ru/misc/tsearch_core-0.33.gz > docs: http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ (not yet completed and > it's > not yet a patch, just a SGML source) > > Implementation details: > - directory layout >src/backend/utils/adt/tsearch - all IO function and simple operations >src/backend/utils/tsearch - complex processing functions, including > language processing and dictionaries > - most of snowball dictionaries are placed in separate .so library and >they plug in into data base by similar way as character conversation >library does. > > If there aren't objections then we plan commit patch tomorrow or after > tomorrow. > Before committing, I'll changes oids from 5000+ to lower values to prevent > holes > in oids. And after that, I'll remove tsearch2 contrib module. 
> > -- > Teodor Sigaev E-mail: [EMAIL PROTECTED] > WWW: http://www.sigaev.ru/ -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] tsearch in core patch, for inclusion
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Teodor Sigaev wrote: > We (Oleg and me) are glad to present tsearch in core of pgsql patch. In > basic, > layout, functions, methods, types etc are the same as in current tsearch2 > with a > lot of improvements: > > - pg_ts_* tables now are in pg_catalog > - parsers, dictionaries, configurations now have owner and namespace > similar to > other pgsql's objects like tables, operator classes etc > - current tsearch configuration is managed with a help of GUC variable > tsearch_conf_name. > - choosing of tsearch cfg by locale may be done for each schema separately > - managing of tsearch configuration with a help of SQL commands, not with > insert/update/delete statements. This allows to drive dependencies, > correct dumping and dropping. > - psql support with a help of \dF* commands > - add all available Snowball stemmers and corresponding configuration > - correct memory freeing by any dictionary > > Work is sponsored by EnterpriseDB's PostgreSQL Development Fund. > > patch: http://www.sigaev.ru/misc/tsearch_core-0.33.gz > docs: http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ (not yet completed and > it's > not yet a patch, just a SGML source) > > Implementation details: > - directory layout >src/backend/utils/adt/tsearch - all IO function and simple operations >src/backend/utils/tsearch - complex processing functions, including > language processing and dictionaries > - most of snowball dictionaries are placed in separate .so library and >they plug in into data base by similar way as character conversation >library does. > > If there aren't objections then we plan commit patch tomorrow or after > tomorrow. > Before committing, I'll changes oids from 5000+ to lower values to prevent > holes > in oids. And after that, I'll remove tsearch2 contrib module. 
> > -- > Teodor Sigaev E-mail: [EMAIL PROTECTED] > WWW: http://www.sigaev.ru/ -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
[HACKERS] Column storage positions
Inspired by this thread [1], and in particular by the idea of storing three numbers (permanent ID, on-disk storage position, display position) for each column, I spent a little time messing around with a prototype implementation of column storage positions to see what kind of difference it would make. The results were encouraging: on a table with 20 columns of alternating smallint and varchar(10) datatypes, selecting the max() of one of the rightmost int columns across 1 million rows ran around 3 times faster. The same query on the leftmost varchar column (which should suffer the most from this change) predictably got a little slower (about 10%); I couldn't measure a performance drop on the rightmost varchar columns. The table's size didn't drop much in this case, but a different table of 20 alternating int and smallint columns showed a 20% slimmer disk footprint, pretty much as expected. Pgbenching showed no measurable difference, which isn't surprising since the pgbench test tables consist of just int values with char filler at the end. So here is a proposal for separating a column's storage position from its permanent ID. I've ignored the display position piece of the original thread because display positions don't do much other than save you the hassle of creating a view on top of your table, while storage positions have demonstrable, tangible benefits. And there is no reason to connect the two features; display positions can easily be added separately at a later point. We want to decouple a column's on-disk storage position from its permanent ID for two reasons: to minimize the space lost to alignment padding between fields, and to speed up access to individual fields. The system will automatically assign new storage positions when a table is created, and when a table alteration requires a rewrite (currently just adding a column with a default, or changing a column datatype). 
To allow users to optimize tables based on the fields they know will be frequently accessed, I think we should extend ALTER TABLE to accept user-assigned storage positions (something like "ALTER TABLE ALTER col SET STORAGE POSITION X"). This command would also be useful for another reason discussed below. In my prototype, I used these rules to determine columns' storage order: 1) fixed-width fields before variable-width, dropped columns always last 2) fixed-width fields ordered by increasing size 3) not-null fields before nullable fields There are other approaches worth considering - for example, you could imagine swapping the priority of rules 2 and 3. Resultant tables would generally have more alignment waste, but would tend to have slightly faster field access. I'm really not sure what the optimal strategy is since every user will have a slightly different metric for "optimal". In any event, either of these approaches is better than the current situation. To implement this, we'll need a field (perhaps attstoragepos?) in pg_attribute to hold the storage position. It will equal attnum until it is explicitly reassigned. The routines in heaptuple.c need to quickly loop through the fields of a tuple in storage order rather than attnum order, so I propose extending TupleDesc to hold an "attrspos" array that sits alongside the attrs array. In the prototype I used an array of int2 indices into the attrs array, ordered by storage position. These changes cause a problem in ExecTypeFromTLInternal: this function calls CreateTemplateTupleDesc followed by TupleDescInitEntry, assuming that attnum == attstoragepos for all tuples. With the introduction of storage positions, this of course will no longer be true. I got around this by having expand_targetlist, build_physical_tlist, and build_relation_tlist make sure each TargetEntry (for targetlists corresponding to either insert/update tuples, or base tuples pulled straight from the heap) gets a correct resorigtbl and resname. 
Then ExecTypeFromTLInternal first tries calling a new function TupleDescInitEntryAttr, which hands off to TupleDescInitEntry and then performs a syscache lookup to update the storage position using the resorigtbl. This is a little ugly because ExecTypeFromTLInternal doesn't know in advance what kind of tupledesc it's building, so it needs to fall back to the old method whenever the syscache lookup fails, but it was enough to pass the regression tests. I could use some advice on this - there's probably a better way to do it.

Another problem relates to upgrades. With tools like pg_migrator now on pgfoundry, people will eventually expect quick upgrades that don't require rewriting each table's data. Storage positions would cause a problem for every version X -> version Y upgrade with Y >= 8.3, even when X is also >= 8.3, because a version X table could always have been altered without a rewrite into a structure different from what Y's CREATE TABLE will choose. I don't think it's as simple as just using the above-mentioned ALTER TABLE extension to a
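To make the storage-order rules listed earlier concrete, here is a sketch (my illustration, not the prototype's actual code) of a sort key that encodes rules 1-3 in priority order:

```python
# Each column: (name, fixed width in bytes or None for variable-width,
#               not_null flag, dropped flag)
cols = [
    ("a", 2,    False, False),  # smallint
    ("b", None, False, False),  # varchar
    ("c", 8,    True,  False),  # bigint NOT NULL
    ("d", 4,    False, True),   # dropped int column
    ("e", 4,    True,  False),  # int NOT NULL
]

def storage_key(col):
    name, width, not_null, dropped = col
    return (
        dropped,           # rule 1: dropped columns always last
        width is None,     # rule 1: fixed-width before variable-width
        width or 0,        # rule 2: fixed-width by increasing size
        not not_null,      # rule 3: NOT NULL before nullable
    )

order = [name for name, *_ in sorted(cols, key=storage_key)]
print(order)  # ['a', 'e', 'c', 'b', 'd']
```

Swapping the priority of rules 2 and 3 is just a matter of reordering the last two tuple elements.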
Re: [HACKERS] Modifying and solidifying contrib
Are we doing this?

---

Joshua D. Drake wrote:
> Hello,
>
> With all the recent discussion on contrib modules etc., I would like to
> offer the following suggestion. I am willing to do a good portion of the
> work myself and I can get it done before feature freeze. I will need
> help with the global makefile stuff however, so that is one dependency.
>
> Add directory /modules
> Modules are compiled and installed by default but not enabled.
> Modules in 8.3 currently are:
>    pgrowlocks
>    pg_freespacemap (to be renamed pgfreespacemap to be consistent)
>    pgstattuple
>    pgcrypto
>    xml2
>    pgbuffercache
>    initagg
>
> Requirements for /modules
>    Must go through normal vetting process on -hackers
>    Must include patches to core documentation in DocBook
>    Must include test cases? I don't recall if we have regress for all
>    contrib stuff.
>
> Keep directory contrib
> Contrib is not compiled or installed by default
> Contrib in 8.3 would be:
>    start-scripts
>    pgbench (which I think really should be a foundry project)
>    vacuumlo (is this even required anymore?)
>    adminpack
>    btree_gist
>    etc...
> Requirements for /contrib
>    Must go through normal vetting process on -hackers
>    Must include README
>    Must include test cases? Same questions as for modules
>
> Thoughts, flames?
>
> Sincerely,
>
> Joshua D. Drake
>
> --
> === The PostgreSQL Company: Command Prompt, Inc. ===
> Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
> Providing the most comprehensive PostgreSQL solutions since 1997
> http://www.commandprompt.com/
>
> Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
> PostgreSQL Replication: http://www.commandprompt.com/products/

-- Bruce Momjian <[EMAIL PROTECTED]>  http://momjian.us
EnterpriseDB  http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup.
Re: [HACKERS] New feature request: FlashBack Query
On 2/20/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
> He probably meant longer transactions and several versions visible to
> different backends.

Yes, he may have... but I was responding to the statements he made.

> > but why shouldn't that be a bit slower, it isn't common practice anyway.
>
> Not for pure OLAP, at least when you have fairly fast transactions. But
> it can slow things down when you have some hotspot tables.

True, but hotspots are hotspots and no matter what caused them or where they are, they slow down performance in one area or another. Limiting hotspots is generally an application-level design decision anyway.

> > Same with rollbacks... why should they optimize for them when 97% of
> > transactions commit?
>
> Or the other way around - you should write code where most transactions
> commit ;)

That's what I said, Oracle shouldn't optimize for rollbacks when most transactions commit.

> Hmm. How can it check visibility at block level and at the same time do
> in-place updates on single tuples ?

In most cases, the block-level SCN determines transaction-level visibility. Now, row locks can exist within that page, but they don't determine visibility... they determine the UNDO location which contains the data required to rebuild a read-consistent version of the block.

> My proposal of keeping visibility info in a separate heap would help to
> get similar results, that is mostly 1 check per page. That would also
> cover much of the index lookup cases below.

Most definitely.

> I don't think we will ever move to rollback segments, but for some
> use-cases moving visibility to a separate heap could make sense.

Yes.

> And if we want to bring back time travel (see another thread about
> "Flashback Queries"), then we may end up implementing the original
> PostgreSQL design spec and make VACUUM siphon dead tuples over to
> archive relations, which already starts looking a little like rollback
> segments, only for other purposes :)

Yes.
> Using a system extensively can also create blind spots about some of the
> systems (mis)features. One learns to avoid doing some things without
> consciously knowing about it.

I've used 'em all and can certainly name issues with Oracle. However, we're discussing improving PostgreSQL, I was responding to Greg's statements, and I don't see the need to bring up unrelated Oracle implementation details which will just lead to a general anti-Oracle discussion.

> To get a really meaningful discussion we should involve someone who has
> *designed* them, not merely used them.

True. My comment was aimed more at uninformed discussion, which leads to Oracle-bashing. Those who have at least used and administered Oracle in production tend to understand Oracle's design decisions and related issues better than those who have only heard of Oracle's issues. I live in the real world and can admit certain failures of any database system regardless of which I prefer. No single database is best for every task. I just didn't want the discussion going where it normally goes: to "we're right and they're wrong". Can we move offline or to another thread if we want to continue discussing Oracle specifics? Otherwise, let's focus on flashback-like functionality in this thread.

-- Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | [EMAIL PROTECTED]
Iselin, New Jersey 08830 | http://www.enterprisedb.com/
Re: [HACKERS] New feature request: FlashBack Query
On 2/20/07, Rod Taylor <[EMAIL PROTECTED]> wrote:
> Do 97% of transactions commit because Oracle has slow rollbacks and
> developers are working around that performance issue, or because they
> really commit?

Again, off-topic, but 97% of all transactions commit according to Jim Gray and his research... not anything related to Oracle.

> I have watched several developers that would prefer to issue numerous
> selects to verify things like foreign keys in the application in order
> to avoid a rollback.

That's just bad development.

> Anyway, I don't have experience with big Oracle applications but I'm
> not so sure that 97% of transactions would commit if rollbacks were
> cheaper.

Again, those stats are not related to Oracle, but to databases in general.

-- Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | [EMAIL PROTECTED]
Iselin, New Jersey 08830 | http://www.enterprisedb.com/
Re: [HACKERS] statement_timeout doesnt work within plpgsql by design?
On Tuesday 20 February 2007 12:50, Tom Lane wrote:
> Robert Treat <[EMAIL PROTECTED]> writes:
> > pagila=# create or replace function test() returns bool as $$ begin set
> > statement_timeout = 3000; perform pg_sleep(4) ; return true; end $$
> > language plpgsql;
> > CREATE FUNCTION
>
> statement_timeout is measured across an entire interactive command, not
> individual commands within a function; and the timeout that applies to
> an interactive command is determined at its beginning. So the above
> doesn't do what you think.

Well, I'd be happy if it caused the entire function to bail out or if it caused individual statements within a function to bail out, but it does neither. I can see how that would be a bit tricky to implement, though.

-- Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] autovacuum next steps
I'm wondering if we can do one better... Since what we really care about is I/O responsiveness for the rest of the system, could we just time how long I/O calls take to complete? I know that gettimeofday can have a non-trivial overhead, but do we care that much about it in the case of autovac?

On Fri, Feb 16, 2007 at 05:37:26PM -0800, Ron Mayer wrote:
> Alvaro Herrera wrote:
> > Once autovacuum_naptime... autovacuum_max_workers...
> > How does this sound?
>
> The knobs exposed on autovacuum feel kinda tangential to
> what I think I'd really want to control.
>
> IMHO "vacuum_mbytes_per_second" would be quite a bit more
> intuitive than cost_delay, naptime, etc.
>
> ISTM I can relatively easily estimate and/or spec out how
> much "extra" I/O bandwidth I have per device for vacuum,
> and would pretty much want vacuum to be constantly
> running on whichever table needs it the most, so
> long as it can stay under that bandwidth limit.
>
> Could vacuum have a tunable that says "X MBytes/second"
> (perhaps per device) and have it measure how much I/O
> it's actually doing and try to stay under that limit?
>
> For more fine-grained control a cron job could go
> around setting different MBytes/second limits during
> peak times vs idle times.
>
> If people are concerned about CPU-intensive vacuums
> instead of I/O-intensive ones (does anyone experience
> that?), another tunable "vacuum_percent_of_cpu" would
> be more straightforward than delay_cost, cost_page_hit,
> etc. But I'd be a bit surprised if CPU-intensive
> vacuums are common.

-- Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
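As a sketch of what the suggested "vacuum_mbytes_per_second" knob could look like internally (hypothetical names, my simplification, not proposed backend code): account for each I/O and sleep whenever cumulative I/O runs ahead of the configured rate:

```python
import time

class VacuumIOThrottle:
    """Keep cumulative I/O under mbytes_per_second by sleeping
    whenever we get ahead of budget (hypothetical sketch)."""

    def __init__(self, mbytes_per_second):
        self.rate = mbytes_per_second * 1024 * 1024  # bytes/sec
        self.start = time.monotonic()
        self.done = 0

    def account(self, nbytes):
        self.done += nbytes
        # Seconds by which we're ahead of the allowed schedule:
        ahead = self.done / self.rate - (time.monotonic() - self.start)
        if ahead > 0:
            time.sleep(ahead)

throttle = VacuumIOThrottle(mbytes_per_second=10)
throttle.account(8192)  # one 8K page read or written
```

Timing the I/O calls themselves, as suggested above, could then feed back into the same accounting without a separate gettimeofday per page.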
Re: [HACKERS] New feature request: FlashBack Query
On Feb 20, 2007, at 1:40 PM, Tom Lane wrote:
> RPK <[EMAIL PROTECTED]> writes:
> > I did not mean asking for undo from a life-time log. Since FlashBack
> > Technology is already there, I just mean that world's most advanced
> > database (PostgreSQL, as they say), must have an optimized way for
> > undoing of at least a week changes.
>
> You're living in a dream world. Do you know any Oracle DBs who keep
> enough rollback segments to go back a week?

Ours go for a good 6 hours sometimes :-D

// Theo Schlossnagle
// Esoteric Curio: http://www.lethargy.org/~jesus/
Re: [HACKERS] New feature request: FlashBack Query
RPK <[EMAIL PROTECTED]> writes:
> I did not mean asking for undo from a life-time log. Since FlashBack
> Technology is already there, I just mean that world's most advanced
> database (PostgreSQL, as they say), must have an optimized way for
> undoing of at least a week changes.

You're living in a dream world. Do you know any Oracle DBs who keep enough rollback segments to go back a week?

regards, tom lane
Re: [HACKERS] New feature request: FlashBack Query
Andrew,

>> Demanding unlimited undo at some time that is arbitrarily distant in the
>> future strikes me as wholly unreasonable.

I did not mean asking for undo from a life-time log. Since FlashBack Technology is already there, I just mean that the world's most advanced database (PostgreSQL, as they say) must have an optimized way of undoing at least a week's changes. A week's log is enough, and PostgreSQL can keep removing old logs automatically.

Secondly, it must be left to the user to decide the number of days of archive he wants to store. Again, up to a week max.

> RPK wrote:
>> I agree that TimeStamp creates an overhead, but I just want to know if an
>> accidental update happened to a table and this incident got traced three
>> days after, what facility PGSQL provide to bring the table to its original
>> condition. You can't wait regretting on why you did not run ROLLBACK
>> before COMMIT. (Correct me. I am only a user).
>
> Why the heck can't you create a reversing transaction? That's what
> ordinary mortals do. Demanding unlimited undo at some time that is
> arbitrarily distant in the future strikes me as wholly unreasonable.
>
> What do you mean by "accidental update"? What you really appear to mean
> is that a program or a human operator has made an error, and incorrectly
> told the database to commit a transaction. The answer surely is to
> correct the behaviour of the program or human, rather than wanting the
> database to provide an undo facility. Alternatively, this should be
> handled at the application layer, using something like table_log.
>
> Some things just don't work well with this sort of facility. Just ask
> your bookie if you can undo a bet that you "accidentally" placed with
> him and which, three days later, you discover (after the race) was a
> mistake.
cheers
andrew

--
View this message in context: http://www.nabble.com/New-feature-request%3A-FlashBack-Query-tf3245023.html#a9067564
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] statement_timeout doesnt work within plpgsql by design?
Robert Treat <[EMAIL PROTECTED]> writes:
> pagila=# create or replace function test() returns bool as $$ begin set
> statement_timeout = 3000; perform pg_sleep(4) ; return true; end $$
> language plpgsql;
> CREATE FUNCTION

statement_timeout is measured across an entire interactive command, not individual commands within a function; and the timeout that applies to an interactive command is determined at its beginning. So the above doesn't do what you think.

regards, tom lane
[HACKERS] statement_timeout doesnt work within plpgsql by design?
pagila=# select version();
                           version
--------------------------------------------------------------
 PostgreSQL 8.2.3 on i386-pc-solaris2.10, compiled by cc -Xa
(1 row)

pagila=# create or replace function test() returns bool as $$ begin set statement_timeout = 3000; perform pg_sleep(4) ; return true; end $$ language plpgsql;
CREATE FUNCTION

pagila=# select test();
 test
------
 t
(1 row)

pagila=# select test();
ERROR:  canceling statement due to statement timeout
CONTEXT:  SQL statement "SELECT pg_sleep(4)"
PL/pgSQL function "test" line 1 at perform

is this behavior by design? if so, why would you design it that way? :-)

-- Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] New feature request: FlashBack Query
Tom Lane wrote: > August Zajonc <[EMAIL PROTECTED]> writes: >> The key is how lightweight the setup could be, which matters because >> clients are not always willing to pay for a PITR setup. The low overhead >> would mean you'd feel fine about setting guc to 1hr or so. > > This would have exactly the same performance consequences as always > having an hour-old open transaction. I'm afraid that describing it > as "low overhead" is mere wishful thinking: it would cripple vacuuming > of high-update tables and greatly increase the typical load on pg_clog > and pg_subtrans. We already know that pg_subtrans contention can be a > source of context-swap storms, with the size of the window back to > GlobalXmin being the controlling factor for how bad it gets. > > It's possible that this last could be addressed by separating the > concept of "old enough to be vacuumed" from GlobalXmin, but it's > certainly not a trivial thing. Isn't GlobalXmin for open transactions? I thought the idea was that everything goes as usual, but you can flip a knob and say that vacuum doesn't vacuum anything more recent than GlobalXmin less x transactions. Then you can look at your transactions per second and get a rough window to work within. Or if there are timestamps on commits, that would switch to a time interval, which is more user friendly. You end up simply delaying when an hour's worth of transactions gets vacuumed. For folks doing nightly cron job vacuums, not too bad. Autovacuum isn't on by default :) Of course, this will be clumsy if not per database. But the thought might be to take advantage of the flashback data already present under the MVCC model as long as vacuum hasn't hit things (and being willing to stop activity on a database etc). Given that you are delaying a vacuum rather than being more aggressive, and know you can already vacuum up to a more recent transaction xmin, I dunno... Does anything depend (other than performance) on vacuum actually vacuuming as far as it can? 
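The horizon computation being suggested above can be sketched like this (my simplification; names are mine, and XID wraparound is ignored):

```python
def delayed_vacuum_horizon(global_xmin, latest_xid, flashback_xacts):
    """Vacuum only up to whichever is older: the usual open-transaction
    horizon (global_xmin), or latest_xid minus the configured
    flashback window of x transactions."""
    return min(global_xmin, latest_xid - flashback_xacts)

# Idle system: the flashback window is what holds vacuum back.
print(delayed_vacuum_horizon(global_xmin=9990, latest_xid=10000,
                             flashback_xacts=5000))  # 5000

# A long-open transaction is already the limiting factor.
print(delayed_vacuum_horizon(global_xmin=4000, latest_xid=10000,
                             flashback_xacts=5000))  # 4000
```

Since the result is never newer than global_xmin, this only ever delays vacuuming; it never vacuums tuples that an open transaction might still see.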
- August
Re: [HACKERS] New feature request: FlashBack Query
"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> On 2/20/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>> It's more like Oracle is optimized for data that's committed
>> long in the past and we're optimized for data that's
>> been recently updated.
>
> Wrong. When Oracle says it's committed, it's committed. No
> difference between when, where, and how. In Oracle, the committed
> version is *always* the first presented to the user...

Sure, and if it was committed long in the past then you can use it. If it's committed recently then you'll have to start looking up rollback data instead. The rest of your post seems to all be predicated on the idea that if data is committed then that's all you'll need to look at. But that's missing precisely the point of what I was saying:

>> In a sense then it's the opposite of what we usually say. Oracle is
>> optimized for mostly static data. Postgres is optimized for changing data.

By "changing data" I meant "data in flux", not the action of making changes to the data. Looking at data in flux in Oracle -- even other data that's unchanged but lives on the same page as some record that's in flux -- will require you to look up rollback data and possibly even have to follow many pages of chained rollback data. Looking at data in Postgres has no additional overhead when it's data in flux versus old static data.

-- Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Mon, Feb 19, 2007 at 07:10:52PM -0800, David Fetter wrote:
> Isn't this one of the big use cases for table partitioning?

Sure, but you can't detach that data in the meantime, AFAIK. Maybe I've missed something. If I have 10 years of finance data, and I have to keep it all online all the time, my electricity costs alone are outrageous. If I can know _where_ the data is without being able to get it until I've actually made it available, then my query tools could be smart enough to detect "table partitioned; this data is offline", I could roll back to my savepoint, return a partial result set to the user, and tell it "call back in 24 hours for your full report".

Yes, I know, hands waving in the air. But I already said I was having a "you know what would be sweet" moment.

A

-- Andrew Sullivan | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary and imaginative work need not end up well. --Dennis Ritchie
Re: [HACKERS] New feature request: FlashBack Query
On Tue, 2007-02-20 at 10:20, Jonah H. Harris wrote:
> On 2/20/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
> > I used to say that too but I've since realized it's not really true.
>
> Heh, take a joke man... I was following up on Drake's email :)
>
> But, since you want to discuss your view of the systems openly... I'll
> gladly reply :)
>
> > It's more like Oracle is optimized for data that's committed
> > long in the past and we're optimized for data that's
> > been recently updated.
>
> Wrong. When Oracle says it's committed, it's committed. No
> difference between when, where, and how. In Oracle, the committed
> version is *always* the first presented to the user... it takes time
> to go back and look at older versions;

Older versions are also committed :)

He probably meant longer transactions and several versions visible to different backends.

> but why shouldn't that be a bit slower, it isn't common practice anyway.

Not for pure OLAP, at least when you have fairly fast transactions. But it can slow things down when you have some hotspot tables.

> Same with rollbacks... why
> should they optimize for them when 97% of transactions commit?

Or the other way around - you should write code where most transactions commit ;)

> > In Oracle the data that's been committed long in the past requires no
> > transactional overhead but the data that's been recently updated
> > requires lots of work to fetch the right version.
>
> Wrong. The same transactional overhead applies to *all* data in
> Oracle no matter when it was committed. Similarly, the only
> overhead required occurs when someone is querying in serializable
> isolation or on read-committed data before or during a commit. On
> short OLTP-type transactions, Oracle has the most optimized solution.
>
> > In Postgres it's the other way around.
> > data that's been committed deleted long ago requires extra work to
> > clean up but data that's been recently changed requires little
> > additional work to see the correct version.
>
> PostgreSQL has little additional work? Like, checking the validity of
> every tuple? Oracle checks visibility at the block level, so there's
> *much* less overhead.

Hmm. How can it check visibility at block level and at the same time do in-place updates on single tuples ?

> Take most of the benchmarks which can hold ~200
> tuples per block. Tables in those benchmarks are 100+ million rows.
> On a sequential scan, Oracle would perform 500K checks, PostgreSQL
> would perform *all* 100M checks (not counting dead versions due to row
> updates and the like).

My proposal of keeping visibility info in a separate heap would help to get similar results, that is mostly 1 check per page. That would also cover much of the index lookup cases below.

> On an index scan, Oracle not only has a
> smaller index and less to check, but also knows the tuple will be
> committed and will, in most cases, not have to perform additional
> physical I/O to find the latest version of a row.

It is also the reason why you can forget about doing simultaneous data loading and queries on the same table. If you avoid doing that, then it "will, in most cases, not have to perform additional physical I/O to find the latest version of a row" ;)

> Of course, Oracle's design is much more complicated in its ability to
> build read-committed versions of the blocks at runtime; something the
> simplicity of PostgreSQL's MVCC design eliminates.
>
> > In a sense then it's the opposite of what we usually say. Oracle is
> > optimized for mostly static data. Postgres is optimized for changing
> > data.
>
> Care to share an example to prove it?
>
> Like always, there are pros and cons with both designs, but denying
> facts gets us nowhere. We're off-topic now... so we should either
> move this off line or to another thread.
> I personally don't see much of a reason to continue discussing MVCC
> designs anymore as Oracle's is patented and PostgreSQL's is highly
> unlikely to change drastically.

I don't think we will ever move to rollback segments, but for some use-cases moving visibility to a separate heap could make sense. And if we want to bring back time travel (see another thread about "Flashback Queries"), then we may end up implementing the original PostgreSQL design spec and make VACUUM siphon dead tuples over to archive relations, which already starts looking a little like rollback segments, only for other purposes :)

> As always, I'd suggest discussing improvements, not the status quo.
> Likewise, discussing Oracle's design, drawbacks, and limitations
> without having used it extensively is quite obvious to anyone familiar
> with Oracle.

Using a system extensively can also create blind spots about some of the systems (mis)features. One learns to avoid doing some things without consciously knowing about it.

> Don't get me wrong, it's fine to prefer one design to
> another, but pushing discussion items c
Re: [HACKERS] HOT WIP Patch - version 2
On Tue, Feb 20, 2007 at 08:31:45PM +0530, Pavan Deolasee wrote:
> I see your point, but as you mentioned do we really care ? The chain
> needs to be broken so that the intermediate DEAD tuples can be
> vacuumed. We can't vacuum them normally because they could
> be a part of live HOT-update chain. Resetting the HOT-updated
> status of the root tuple helps to mark the index entry LP_DELETE
> once the entire HOT-update chain is dead.
> ...

For some reason this paragraph raised a query in my mind. Will we be able to toggle this new "hot update" code at configure time, so that we can measure what sort of effect this change has once it is complete? Even if only during the early testing cycles for the next release, I think it would be useful.

Cheers,
mark

-- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Ottawa, Ontario, Canada  http://mark.mielke.cc/
Re: [HACKERS] New feature request: FlashBack Query
> Wrong. When Oracle says it's committed, it's committed. No
> difference between when, where, and how. In Oracle, the committed
> version is *always* the first presented to the user... it takes time
> to go back and look at older versions; but why shouldn't that be a bit
> slower, it isn't common practice anyway. Same with rollbacks... why
> should they optimize for them when 97% of transactions commit?

Do 97% of transactions commit because Oracle has slow rollbacks and developers are working around that performance issue, or because they really commit?

I have watched several developers that would prefer to issue numerous selects to verify things like foreign keys in the application in order to avoid a rollback.

Anyway, I don't have experience with big Oracle applications, but I'm not so sure that 97% of transactions would commit if rollbacks were cheaper.
Re: [HACKERS] New feature request: FlashBack Query
On 2/20/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
> I used to say that too but I've since realized it's not really true.

Heh, take a joke man... I was following up on Drake's email :)

But, since you want to discuss your view of the systems openly... I'll gladly reply :)

> It's more like Oracle is optimized for data that's committed
> long in the past and we're optimized for data that's
> been recently updated.

Wrong. When Oracle says it's committed, it's committed. No difference between when, where, and how. In Oracle, the committed version is *always* the first presented to the user... it takes time to go back and look at older versions; but why shouldn't that be a bit slower, it isn't common practice anyway. Same with rollbacks... why should they optimize for them when 97% of transactions commit?

> In Oracle the data that's been committed long in the past requires no
> transactional overhead but the data that's been recently updated
> requires lots of work to fetch the right version.

Wrong. The same transactional overhead applies to *all* data in Oracle no matter when it was committed. Similarly, the only overhead required occurs when someone is querying in serializable isolation or on read-committed data before or during a commit. On short OLTP-type transactions, Oracle has the most optimized solution.

> In Postgres it's the other way around. data that's been committed deleted
> long ago requires extra work to clean up but data that's been recently
> changed requires little additional work to see the correct version.

PostgreSQL has little additional work? Like, checking the validity of every tuple? Oracle checks visibility at the block level, so there's *much* less overhead. Take most of the benchmarks, which can hold ~200 tuples per block. Tables in those benchmarks are 100+ million rows. On a sequential scan, Oracle would perform 500K checks; PostgreSQL would perform *all* 100M checks (not counting dead versions due to row updates and the like).
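The visibility-check arithmetic above is easy to reproduce (same round benchmark numbers as in the email):

```python
rows = 100_000_000        # rows in the benchmark table
tuples_per_block = 200    # ~200 tuples fit per block

block_checks = rows // tuples_per_block  # one visibility check per block
tuple_checks = rows                      # one visibility check per tuple

print(block_checks)                  # 500000
print(tuple_checks // block_checks)  # 200x fewer checks at block level
```

This is the same per-page factor that the separate-visibility-heap proposal quoted elsewhere in the thread is aiming for ("mostly 1 check per page").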
On an index scan, Oracle not only has a smaller index and less to check, but also knows the tuple will be committed and will, in most cases, not have to perform additional physical I/O to find the latest version of a row. Of course, Oracle's design is much more complicated in its ability to build read-committed versions of the blocks at runtime; something the simplicity of PostgreSQL's MVCC design eliminates.

> In a sense then it's the opposite of what we usually say. Oracle is
> optimized for mostly static data. Postgres is optimized for changing data.

Care to share an example to prove it?

Like always, there are pros and cons with both designs, but denying facts gets us nowhere. We're off-topic now... so we should either move this offline or to another thread. I personally don't see much of a reason to continue discussing MVCC designs anymore, as Oracle's is patented and PostgreSQL's is highly unlikely to change drastically.

As always, I'd suggest discussing improvements, not the status quo. Likewise, discussing Oracle's design, drawbacks, and limitations without having used it extensively is quite obvious to anyone familiar with Oracle. Don't get me wrong, it's fine to prefer one design to another, but pushing discussion items comparing Oracle to PostgreSQL because of things you've heard or read somewhere isn't the same as understanding them because you've used them.

-- Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | [EMAIL PROTECTED]
Iselin, New Jersey 08830 | http://www.enterprisedb.com/
Re: [PATCHES] [HACKERS] HOT WIP Patch - version 2
On 2/20/07, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> Tom Lane wrote:
> > "Recently dead" means "still live to somebody", so those tids better not
> > change either. But I don't think that's what he meant. I'm more
> > worried about the deadlock possibilities inherent in trying to upgrade a
> > buffer lock. We do not have deadlock detection for LWLocks.
>
> I am guessing he is going to have to release the lock, then ask for an
> exclusive one.

Yes, that's what is done. Since we try to prune the HOT-update chain even in the SELECT path, we upgrade the lock only if we are sure that there is at least one tuple that can be removed from the chain or the root needs to be fixed (a broken ctid chain for some reason).

Thanks,
Pavan

-- EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] HOT WIP Patch - version 2
On 2/20/07, Tom Lane <[EMAIL PROTECTED]> wrote: "Pavan Deolasee" <[EMAIL PROTECTED]> writes: > ... Yes. The HOT-updated status of the root and all intermediate > tuples is cleared and their respective ctid pointers are made to > point to themselves. Doesn't that destroy the knowledge that they form a tuple chain? While it might be that no one cares any longer, it would seem more reasonable to leave 'em chained together. I see your point, but as you mentioned, do we really care? The chain needs to be broken so that the intermediate DEAD tuples can be vacuumed. We can't vacuum them normally because they could be a part of a live HOT-update chain. Resetting the HOT-updated status of the root tuple helps to mark the index entry LP_DELETE once the entire HOT-update chain is dead. Also, if we decide to reuse the heap-only tuples without even vacuuming, breaking the chain is a better option since we then guarantee no references to the heap-only DEAD tuples. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [HACKERS] HOT WIP Patch - version 2
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Pavan Deolasee wrote: > >> When following a HOT-update chain from the index fetch, if we notice that > >> the root tuple is dead and it is HOT-updated, we try to prune the chain to > >> the smallest possible length. To do that, the share lock is upgraded to an > >> exclusive lock and the tuple chain is followed till we find a > >> live/recently-dead > >> tuple. At that point, the root t_ctid is made point to that tuple. In order > > > I assume you meant recently-dead here, rather than live/recently-dead, > > because we aren't going to change live ctids, right? > > "Recently dead" means "still live to somebody", so those tids better not > change either. But I don't think that's what he meant. I'm more > worried about the deadlock possibilities inherent in trying to upgrade a > buffer lock. We do not have deadlock detection for LWLocks. I am guessing he is going to have to release the lock, then ask for an exclusive one. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] HOT WIP Patch - version 2
Bruce Momjian <[EMAIL PROTECTED]> writes: > Pavan Deolasee wrote: >> When following a HOT-update chain from the index fetch, if we notice that >> the root tuple is dead and it is HOT-updated, we try to prune the chain to >> the smallest possible length. To do that, the share lock is upgraded to an >> exclusive lock and the tuple chain is followed till we find a >> live/recently-dead >> tuple. At that point, the root t_ctid is made point to that tuple. In order > I assume you meant recently-dead here, rather than live/recently-dead, > because we aren't going to change live ctids, right? "Recently dead" means "still live to somebody", so those tids better not change either. But I don't think that's what he meant. I'm more worried about the deadlock possibilities inherent in trying to upgrade a buffer lock. We do not have deadlock detection for LWLocks. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] HOT WIP Patch - version 2
Pavan Deolasee wrote: > On 2/20/07, Bruce Momjian <[EMAIL PROTECTED]> wrote: > > > > Pavan Deolasee wrote: > > > When following a HOT-update chain from the index fetch, if we notice > > that > > > the root tuple is dead and it is HOT-updated, we try to prune the chain > > to > > > the smallest possible length. To do that, the share lock is upgraded to > > an > > > exclusive lock and the tuple chain is followed till we find a > > > live/recently-dead > > > tuple. At that point, the root t_ctid is made point to that tuple. In > > order > > > > I assume you meant recently-dead here, rather than live/recently-dead, > > because we aren't going to change live ctids, right? > > > No, I meant live or recently-dead (in fact, anything other than > HEAPTUPLE_DEAD > or HEAPTUPLE_DEAD_CHAIN). > > We are not changing the tids here, but only pruning the HOT-update chain. > After pruning, the root->t_ctid points to the oldest tuple that might be > visible to any backend. The live tuples are still identified by their > original tid and index reachable from the root tuple. I am confused. Where is the root->t_ctid stored? -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] HOT WIP Patch - version 2
On 2/20/07, Bruce Momjian <[EMAIL PROTECTED]> wrote: Pavan Deolasee wrote: > When following a HOT-update chain from the index fetch, if we notice that > the root tuple is dead and it is HOT-updated, we try to prune the chain to > the smallest possible length. To do that, the share lock is upgraded to an > exclusive lock and the tuple chain is followed till we find a > live/recently-dead > tuple. At that point, the root t_ctid is made point to that tuple. In order I assume you meant recently-dead here, rather than live/recently-dead, because we aren't going to change live ctids, right? No, I meant live or recently-dead (in fact, anything other than HEAPTUPLE_DEAD or HEAPTUPLE_DEAD_CHAIN). We are not changing the tids here, but only pruning the HOT-update chain. After pruning, the root->t_ctid points to the oldest tuple that might be visible to any backend. The live tuples are still identified by their original tid and index reachable from the root tuple. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
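The pruning described in this exchange — walk the chain from the root past DEAD tuples and repoint the root at the oldest tuple that might still be visible to some backend — can be sketched as a toy model. The names and statuses below are invented for illustration; this is not the actual PostgreSQL page or tuple code, and the root line pointer bookkeeping is glossed over:

```python
# Toy model of HOT-update chain pruning as discussed in the thread.
# Statuses loosely mirror the visibility outcomes mentioned in the mail;
# prune_chain is an invented helper, not a real PostgreSQL function.

DEAD = "dead"
RECENTLY_DEAD = "recently_dead"  # "still live to somebody"
LIVE = "live"

def prune_chain(chain):
    """chain[0] is the root tuple's status; chain[1:] follow t_ctid.
    Return the pruned chain: the root is repointed at the first tuple
    that is not DEAD, and the intermediate DEAD tuples drop out
    (becoming reclaimable by VACUUM)."""
    for i, status in enumerate(chain):
        if status != DEAD:
            return chain[i:]
    # Whole chain dead: the index entry can be marked LP_DELETE.
    return []
```

Under this model a chain of two dead tuples followed by a recently-dead one prunes down so the root points straight at the recently-dead tuple, matching the "oldest tuple that might be visible to any backend" rule in the mail.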
Re: [HACKERS] ToDo: add documentation for operator IS OF
> > what code do you want to remove? > Peter suggested removing the IS OF code itself. Without a replacement you'd hurt a lot of people. The time for removing it was 4 years ago. Regards Pavel Stehule ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] ToDo: add documentation for operator IS OF
Pavel Stehule wrote: > > > documentation addition and add a comment to gram.y? > > > >OK, I have votes to remove the code, remove the documentation, and keep > >all of it but document its behavior might change in the future. > > what code do you want to remove? Peter suggested removing the IS OF code itself. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] HOT WIP Patch - version 2
Pavan Deolasee wrote: > When following a HOT-update chain from the index fetch, if we notice that > the root tuple is dead and it is HOT-updated, we try to prune the chain to > the smallest possible length. To do that, the share lock is upgraded to an > exclusive lock and the tuple chain is followed till we find a > live/recently-dead > tuple. At that point, the root t_ctid is made point to that tuple. In order I assume you meant recently-dead here, rather than live/recently-dead, because we aren't going to change live ctids, right? -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] HOT WIP Patch - version 2
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > ... Yes. The HOT-updated status of the root and all intermediate > tuples is cleared and their respective ctid pointers are made to > point to themselves. Doesn't that destroy the knowledge that they form a tuple chain? While it might be that no one cares any longer, it would seem more reasonable to leave 'em chained together. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] ToDo: add documentation for operator IS OF
> > documentation addition and add a comment to gram.y? > OK, I have votes to remove the code, remove the documentation, and keep > all of it but document its behavior might change in the future. what code do you want to remove? > I will leave the documentation, but comment it out so it doesn't appear > in the output, and state why in the comment. I will also add a TODO item. > -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] ToDo: add documentation for operator IS OF
bruce wrote: > Joe Conway wrote: > > Bruce Momjian wrote: > > > Pavel Stehule wrote: > > >> Hello, > > >> > > >> I miss doc for this operator > > > > > > Strange that IS [NOT] OF wasn't documented, especially seeing it was added in > > > PostgreSQL 7.3. Anyway, documented and backpatched to 8.2.X. > > > > Here's the reason -- see this thread: > > http://archives.postgresql.org/pgsql-patches/2003-08/msg00062.php > > Wow, interesting. I do remember that now. Should I revert the > documentation addition and add a comment to gram.y? OK, I have votes to remove the code, remove the documentation, and keep all of it but document its behavior might change in the future. I will leave the documentation, but comment it out so it doesn't appear in the output, and state why in the comment. I will also add a TODO item. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Plan invalidation design
Simon Riggs wrote: > On Sat, 2007-02-17 at 12:48 -0500, Tom Lane wrote: > > Relcache inval casts a fairly wide net; for example, adding or dropping an > > index will invalidate all plans using the index's table whether or not > > they used that particular index, and I believe that VACUUM will also > > result in a relcache inval due to updating the table's pg_class row. > > I think this is a good thing though --- for instance, after adding an > > index it seems a good idea to replan to see if the new index is useful, > > and replanning after a VACUUM is useful if the table has changed size > > enough to warrant a different plan. OTOH this might mean that plans on a > > high-update-traffic table never survive very long because of autovacuum's > > efforts. If that proves to be a problem in practice we can look at ways > > to dial down the number of replans, but for the moment I think it's more > > important to be sure we *can* replan at need than to find ways to avoid > > replans. > > Just some info on that: In an update-intensive scenario, I'm seeing > VACUUMs every 2 minutes on the heaviest hit tables on CVS HEAD on a > medium-powered 4-CPU server. Re-planning multiple queries on 100+ > sessions every few minutes would not be good. I would think the inval would be sent if relpages changed by more than a certain threshold, say 10%. In steady state, a high-update table that's under continuous vacuum should not change size much, thus no replan. But clearly the point here is to get the inval to be sent at all, and look for inhibition mechanisms later. > Presumably ANALYZE would have the same effect? It would be nice to have a way to calculate a delta from the previous statistics snapshot and send an inval if it's appropriate. Can it be done? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
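Alvaro's threshold idea — only send the invalidation when relpages has moved by more than some fraction — amounts to a one-line check. The sketch below uses invented names (no such helper exists in the patch or in PostgreSQL); it just makes the steady-state argument concrete:

```python
def should_send_inval(old_relpages, new_relpages, threshold=0.10):
    """Decide whether a post-VACUUM relcache inval is worth sending:
    only if the table's page count changed by more than `threshold`
    (10% as suggested in the mail). A steady-state high-update table
    under continuous vacuum barely changes size, so plans survive.
    Invented helper for illustration, not actual PostgreSQL code."""
    if old_relpages == 0:
        # Table grew from empty: always worth a replan.
        return new_relpages > 0
    return abs(new_relpages - old_relpages) / old_relpages > threshold
```

With this rule, a table hovering between 100 and 105 pages never triggers a replan, while a jump to 120 pages (or a shrink to 85) does.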
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
On Tue, 20 Feb 2007, Tom Lane wrote: > I can't believe that any production situation could tolerate the > overhead of one-commit-per-log-line. There aren't that many log lines, and a production environment with lots of commit throughput won't even notice. The installation I work on tuning does 300 small commits per second on a bad day. I can barely measure the overhead of whether or not the log files are involved in that if I'm importing them at the same time. The situation obviously changes if you're logging per-query level detail. > So a realistic tool for this is going to have to be able to wrap blocks > of maybe 100 or 1000 or so log lines with BEGIN/COMMIT, and that is > exactly as difficult as wrapping them with a COPY command. Thus, I > disbelieve your argument. We should not be designing this around an > assumed use-case that will only work for toy installations. Wrapping the commits in blocks to lower overhead is appropriate for toy installations, and probably medium sized ones too. Serious installations, with battery-backed cache writes and similar commit throughput enhancing hardware, can commit a low-volume stream like the logs whenever they please. That's the environment my use-case comes from. Anyway, it doesn't really matter; I can build a tool with COPY style output as well, it just won't be trivial like the INSERT one would be. My reasons for "would slightly prefer INSERT" clearly aren't strong enough to override the issues you bring up with the average case. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
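The block-wrapping Tom describes — batches of 100 or 1000 log lines between a BEGIN and a COMMIT so the import pays one commit per batch rather than one per line — could be generated like this. The table and column names are made up for illustration; this is a sketch of the batching idea, not any actual log-import tool:

```python
def batch_log_inserts(lines, batch_size=1000):
    """Yield SQL batches of INSERT statements wrapped in BEGIN/COMMIT,
    so a log import commits once per `batch_size` lines instead of
    once per line. The pg_log_import table here is hypothetical."""
    for start in range(0, len(lines), batch_size):
        batch = ["BEGIN;"]
        for line in lines[start:start + batch_size]:
            # %r gives a quoted Python literal; a real tool would use
            # proper SQL escaping or, as Tom suggests, COPY instead.
            batch.append(
                "INSERT INTO pg_log_import (line) VALUES (%r);" % line)
        batch.append("COMMIT;")
        yield "\n".join(batch)
```

Swapping the INSERT statements for a single COPY block per batch would change only the loop body, which is Tom's point that the two formats are about equally hard to batch.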
Re: [HACKERS] New feature request: FlashBack Query
RPK wrote: > > I agree that TimeStamp creates an overhead, but I just want to know if an > accidental update happened to a table and this incident got traced three > days after, what facility PGSQL provides to bring the table to its original > condition. You can't wait regretting on why you did not run ROLLBACK > before > COMMIT. (Correct me. I am only a user). > Why the heck can't you create a reversing transaction? That's what ordinary mortals do. Demanding unlimited undo at some time that is arbitrarily distant in the future strikes me as wholly unreasonable. What do you mean by "accidental update"? What you really appear to mean is that a program or a human operator has made an error, and incorrectly told the database to commit a transaction. The answer surely is to correct the behaviour of the program or human, rather than wanting the database to provide an undo facility. Alternatively, this should be handled at the application layer, using something like table_log. Some things just don't work well with this sort of facility. Just ask your bookie if you can undo a bet that you "accidentally" placed with him and which, three days later, you discover (after the race) was a mistake. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] ToDo: add documentation for operator IS OF
> On Tuesday, 20 February 2007 02:13, Bruce Momjian wrote: > > Wow, interesting. I do remember that now. Should I revert the > > documentation addition and add a comment to gram.y? > I'd say remove the code. Probably nobody uses it for inheritance, but some people (me too) use it in polymorphic functions. You can remove it, but please replace it. Regards Pavel Stehule ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] New feature request: FlashBack Query
I agree that TimeStamp creates an overhead, but I just want to know if an accidental update happened to a table and this incident got traced three days after, what facility PGSQL provides to bring the table to its original condition. You can't wait regretting on why you did not run ROLLBACK before COMMIT. (Correct me. I am only a user). When talking about Oracle's technology and that it creates overhead, it is true, Oracle's database is not for ordinary machines. You can't expect performance on a normal 256 MB machine with Oracle. But still, the more the options of recovery, the better for mission-critical environments. The feature of enabling/disabling TimeStamp logging is acceptable. A user must be able to decide whether a FlashBack-type option is needed or not. In Oracle 10g we can switch off the "FlashBack" feature if we are low on resources. If PGSQL is to be used in a mission-critical situation then no company will rely on low-end machines. For these types of situations the best environment is chosen, and I think PGSQL must have this type of recovery options. The PGSQL installer can ask the user during setup to enable/disable TimeStamp Logging. Restoring the database from a backup file that was created three days ago is not feasible. The changes in other tables and the new things created would need to be done again, at the price of just undoing the last update on a particular table. Warren Turkal-5 wrote: > > On Saturday 17 February 2007 07:49, RPK wrote: >> PostgreSQL, already a mature database, needs to have more options for >> recovery as compared to proprietary databases. I just worked with >> Oracle's >> FlashBack query feature in Oracle 9i and FlashBack Table feature in 10g. >> >> Future versions of PostgreSQL must have similar features which enable >> users >> to bring Table(s) and/or Database(s) to a desired Time Stamp. > > Check out my proposal[1] for Temporal extensions. Ultimately, creating valid > time and transaction time tables would be possible through my proposal.
> Please > check it out. > > [1]http://archives.postgresql.org/pgsql-hackers/2007-02/msg00540.php > > wt > -- > Warren Turkal (w00t) > > ---(end of broadcast)--- > TIP 9: In versions below 8.0, the planner will ignore your desire to >choose an index scan if your joining column's datatypes do not >match > > -- View this message in context: http://www.nabble.com/New-feature-request%3A-FlashBack-Query-tf3245023.html#a9059865 Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] New version of IDENTITY/GENERATED
Hi, I started working on my previous patch, encouraged by the fact that it became a wishlist item for 8.3. :-) The changes in this version are: - Refreshed to almost current (5 days old) CVS version of 8.3 devel - The original SERIAL pseudo type is left alone, you _have to_ spell out GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY to get an identity column. - The "action-at-a-distance" behaviour is actually working for the IDENTITY/GENERATED columns on INSERT so the DEFAULT value is generated for them after all the regular columns were validated via ExecConstraints(). This way, if the validation fails, the sequence isn't inflated. - Test case is updated to reflect the above. - Documentation is updated, "Identity columns" has a new subsection now. - Dropped my pg_dump changes, as the altered sequence is also dumped in 8.2, thanks to Tom Lane. I am considering the following: - Since IDENTITY is a new feature (plain old SERIAL behaves the same as always) I will restore the SQL:2003 conformant check that there can be only one identity column in a table at any time. - I read somewhere (but couldn't find it now in SQL:2003) that CHECK constraints cannot be defined for GENERATED (and IDENTITY?) columns. Maybe it was in the latest draft, I have to look at it... Anyway, I have to implement checks to disallow CHECKs for such columns. - Introduce an ALTER TABLE SET|DROP IDENTITY so a serial can be "upgraded" to an identity. This way, an identity column can be built by hand and pg_dump will need it, too. SET IDENTITY will either have to issue an error if CHECKs are defined for such columns or automatically drop every such constraint. And I have a question, too. Is there a way to use ExecEvalExpr*() so values from a given tuple are used for the "current" row? E.g. at present, UPDATE table SET f1 = f1 + 1, f2 = f1 + 1; sets both fields' new value to (f1 value before UPDATE) + 1.
For a GENERATED column, the value _after_ UPDATE is needed, so CREATE TABLE table ( f1 INTEGER, f2 INTEGER GENERATED ALWAYS AS (f1 + 1)); and no matter which one of the following is used: UPDATE table SET f1 = f1 + 1; or UPDATE table SET f1 = f1 + 1, f2 = default; the f2 current value = f1 current value + 1 is always maintained. Best regards, Zoltán Böszörményi psql-serial-30.diff.gz Description: Unix tar archive ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
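The UPDATE semantics Zoltán is after — SET clauses all see pre-update values, but generated columns are recomputed against the row's post-update values — can be modeled with a small sketch. This is a conceptual toy with invented names, not the patch's ExecEvalExpr machinery:

```python
def apply_update(row, set_clauses, generated):
    """Apply an UPDATE to `row` (a dict). SQL semantics: every SET
    right-hand side is evaluated against the OLD row; generated
    columns are then recomputed against the NEW row, so invariants
    like f2 = f1 + 1 survive the update. Toy model for illustration."""
    old = dict(row)
    new = dict(row)
    for col, expr in set_clauses.items():
        new[col] = expr(old)          # RHS sees pre-update values
    for col, expr in generated.items():
        new[col] = expr(new)          # generated cols see post-update values
    return new
```

The difference between the two passes is exactly the behavior gap described in the mail: with plain SET clauses both columns see the old f1, while a generated f2 tracks the new f1.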
Re: [HACKERS] pg_proc without oid?
On Tuesday, 20 February 2007 09:24, Magnus Hagander wrote: > Ok. Will do once the entries in pg_proc are changed, so that I can still > build. It's done. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] ToDo: add documentation for operator IS OF
On Tuesday, 20 February 2007 02:13, Bruce Momjian wrote: > Wow, interesting. I do remember that now. Should I revert the > documentation addition and add a comment to gram.y? I'd say remove the code. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New feature request: FlashBack Query
"Jonah H. Harris" <[EMAIL PROTECTED]> writes: > On 2/17/07, Joshua D. Drake <[EMAIL PROTECTED]> wrote: >> My understanding is that the main difference is that rollbacks are >> inexpensive for us, but expensive for Oracle. > > Yes, Oracle is optimized for COMMIT, we're optimized for ROLLBACK :) I used to say that too but I've since realized it's not really true. It's more like Oracle is optimized for data that's committed long in the past and we're optimized for data that's been recently updated. In Oracle the data that's been committed long in the past requires no transactional overhead but the data that's been recently updated requires lots of work to fetch the right version. In Postgres it's the other way around. Data that was deleted long ago requires extra work to clean up but data that's been recently changed requires little additional work to see the correct version. In a sense then it's the opposite of what we usually say. Oracle is optimized for mostly static data. Postgres is optimized for changing data. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] pg_proc without oid?
On Mon, Feb 19, 2007 at 11:18:36AM -0500, Tom Lane wrote: > Peter Eisentraut <[EMAIL PROTECTED]> writes: > > On Monday, 19 February 2007 16:50, Tom Lane wrote: > >> In the second place, if you don't want to predetermine OIDs for your > >> functions then they shouldn't be in hardwired pg_proc.h rows at all. > > > Where else would you put them? > > SQL script maybe, much along the lines Greg was just mentioning. > (I'd been thinking myself earlier that pg_amop/amproc/etc would be a > whole lot easier to maintain if we could feed CREATE OPERATOR CLASS > commands to the bootstrap process.) But getting there will take > nontrivial work; you can't just decide to leave out a few OIDs on the > spur of the moment. > > Magnus, I'd suggest reverting whatever you did to your MSVC script, > so we'll find out the next time someone makes this mistake... Ok. Will do once the entries in pg_proc are changed, so that I can still build. BTW, another problem with the stuff that's in there now - pg_proc.h contains description entries for the functions, but that never goes into pg_description, since there is no oid to bind it to... //Magnus ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] HOT WIP Patch - version 2
On 2/20/07, Hannu Krosing <[EMAIL PROTECTED]> wrote: On Tue, 2007-02-20 at 12:08, Pavan Deolasee wrote: What do you do, if there are no live tuples on the page ? will this un-HOTify the root and free all other tuples in HOT chain ? Yes. The HOT-updated status of the root and all intermediate tuples is cleared and their respective ctid pointers are made to point to themselves. The index entry will be marked LP_DELETE as with the normal case. VACUUM can subsequently reclaim these tuples, along with the index entry. > > The intermediate heap-only tuples are removed from the HOT-update > chain. > The HOT-updated status of these tuples is cleared and their respective > t_ctid are made to point to themselves. These tuples are not reachable > now and ready for vacuuming. Does this mean, that they are now indistinguishable from ordinary tuples ? No. The HEAP_ONLY_TUPLE flag is still set on these tuples. So you can distinguish those tuples. Maybe they could be freed right away instead of changing HOT-updated status and ctid ? Yeah, that's a good idea. I am thinking of setting the LP_DELETE flag on them while pruning. The tuple then can be reused for the next in-page HOT-update. > When we run out of space for update-within-the-block, we traverse > through all the line pointers looking for LP_DELETEd items. If any of > these > items have space large enough to store the new tuple, that item is > reused. > Does anyone see any issue with doing this ? Also, any suggestions > about doing it in a better way ? IIRC the size is determined by the next tuple pointer, so you can store new data without changing the tuple pointer only if they are exactly the same size. There is a lp_len field in the line pointer to store the length of the tuple. ISTM that we can reduce that while reusing the line pointer. But that would create a permanent hole in the page.
> we are > more concerned about the large tables, the chances of being able to > upgrade > the exclusive lock to vacuum-strength lock are high. Comments ? I'm not sure about the "we are more concerned about the large tables" part. I see it more as a device for high-update tables. This may not always be the same as "large", so there should be some fallbacks for case where you can't get the lock. Maybe just give up and move to another page ? Oh, yes. I agree. The fallback option of doing COLD update always exists. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
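The slot-reuse scan described above — walk the page's line pointers looking for an LP_DELETEd item whose recorded length can hold the new tuple — is simple to sketch. The structures below are invented stand-ins for the real page layout, just to make the lp_len discussion concrete:

```python
# Toy model of reusing LP_DELETEd line pointers for an in-page
# HOT update. (flags, lp_len) pairs stand in for real item pointers;
# find_reusable_slot is an invented helper, not actual page code.

LP_DELETE = "lp_delete"

def find_reusable_slot(line_pointers, needed_len):
    """Return the index of the first LP_DELETEd line pointer whose
    lp_len can hold `needed_len` bytes, or None if the update must
    fall back to a COLD update elsewhere. Note Hannu's caveat: reusing
    a larger slot for a smaller tuple leaves a hole in the page."""
    for i, (flags, lp_len) in enumerate(line_pointers):
        if flags == LP_DELETE and lp_len >= needed_len:
            return i
    return None
```

Returning None corresponds to the fallback Pavan mentions: when no reusable slot (and no free space) exists, the update goes COLD to another page.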