Re: [GENERAL] Buglist

2003-08-28 Thread Christopher Browne
After a long battle with technology, [EMAIL PROTECTED] (Vivek Khera), an
earthling, wrote:
>> "TL" == Tom Lane <[EMAIL PROTECTED]> writes:
>
> TL> Just nice'ing the VACUUM process is likely to be counterproductive
> TL> because of locking issues (priority inversion).  Though if anyone cares
> TL> to try it on a heavily-loaded system, I'd be interested to hear the
> TL> results...
>
> tried it once.  didn't make much difference except that vacuum took
> longer than normal.  i didn't see any deadlocks.
>
> i actually figured out what my main problem was.  vacuum every 6 hours
> on my two busiest tables was taking longer than 6 hours when we were
> very busy...

I "wedged" a database server once that way; it was busy, busy, busy
with a multiplicity of processes trying to simultaneously vacuum the
same table.

The "new generation" resolution to that is pg_autovacuum; if you're
running a pre-7.3 version, a good approach is a vacuum script that
checks a "lock file" and exits if it sees that another process is
already busy vacuuming (a sketch of the idea follows below).
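
A minimal sketch of that interlock, recast with advisory locks instead of
a file on disk; note this assumes a far newer toolchain than the thread
had (pg_try_advisory_lock needs PostgreSQL 8.2+, \if needs psql 10+), and
the key 42, database name and table name are placeholders:

-- vacuum_once.sql -- run via: psql -d mydb -f vacuum_once.sql
-- Take a session-level advisory lock; if another session holds it,
-- a vacuum is already in progress and we exit quietly.
SELECT pg_try_advisory_lock(42) AS got_lock \gset
\if :got_lock
VACUUM ANALYZE busy_table;
SELECT pg_advisory_unlock(42);
\else
\echo 'another vacuum is already running; exiting'
\endif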
-- 
output = reverse("gro.mca" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/postgresql.html
"I am aware of the benefits  of a micro kernel approach.  However, the
fact remains  that Linux is  here, and GNU  isn't --- and  people have
been working on Hurd for a lot longer than Linus has been working on
Linux." -- Ted T'so, 1992.



Re: [GENERAL] Buglist

2003-08-28 Thread Christopher Browne
After a long battle with technology, [EMAIL PROTECTED] (Jan Wieck), an earthling, wrote:
> Tom Lane wrote:
>> Jan Wieck <[EMAIL PROTECTED]> writes:
>>
>>> What about a little hint to the buffer management that if it has to
>>> evict another buffer to physically read this one (meaning the
>>> buffer pool was full already) then it will not put this buffer at
>>> the top of the LRU chain but rather at its end? This way a vacuum
>>> on a large table will not cause a complete cache eviction.
>> I think what we really need is a way to schedule VACUUM's I/O at a
>> lower
>> priority than normal I/Os.  Wouldn't be very portable :-( ... but if the
>> OS offers a facility for requesting this, it'd be worth experimenting
>> with.
>
> Whatever priority it has, I think the fact that a VACUUM is kicking
> everything out of a carefully populated buffer cache and possibly
> replacing it with data of low to no interest at all should have some
> space for improvement. And that one single optimizer mistake choosing
> a seqscan over an index scan for a huge table does the same doesn't
> strike me as smart either.

[Thinking out loud...]

There wouldn't be some way of marking the pages that are read in for
a VACUUM as having the lowest possible priority, would there?

It's no grand insult if VACUUM consumes _some_ buffer cache pages, but
if it were set at low priority, once read in, would it not be possible
for those blocks to quickly get reused by the later VACUUM work, thus
limiting the degree to which the cache of _useful_ data got spilled
out?

I suppose this might turn a simple LRU queue into something resembling
a priority queue, but it's a thought...
-- 
If this was helpful,  rate me
http://www3.sympatico.ca/cbbrowne/sap.html
"He who  writes the code gets  to choose his license,  and nobody else
gets to complain." -- Linus Torvalds



Re: [GENERAL] Buglist

2003-08-28 Thread Bruce Momjian
Vivek Khera wrote:
> > "BM" == Bruce Momjian <[EMAIL PROTECTED]> writes:
> 
> 
> BM> Hold --- a non-FULL vacuum is taking 6+ hours on two tables?  That seems
> BM> impossible.
> 
> Well, did I mention I'm saturating my disk I/O bandwidth at the same
> time with other queries? ;-)

But six hours.  It is my understanding that a sequential scan is roughly
the same load as a non-FULL vacuum.  Are you saying a sequential scan
takes 6+ hours too?  How can any work get done?
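
(A quick, purely illustrative way to time a bare sequential scan for that
comparison; the table name is a placeholder:

EXPLAIN ANALYZE SELECT count(*) FROM busy_table;

If that finishes in minutes while a plain VACUUM needs 6+ hours, the gap
is coming from the concurrent I/O load rather than from vacuum itself.)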

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [GENERAL] Buglist

2003-08-28 Thread Vivek Khera
> "BM" == Bruce Momjian <[EMAIL PROTECTED]> writes:

>> 
>> Well, did I mention I'm saturating my disk I/O bandwidth at the same
>> time with other queries? ;-)

BM> But six hours.  It is my understanding that a sequential scan is roughly
BM> the same load as a non-FULL vacuum.  Are you saying a sequential scan
BM> takes 6+ hours too?  How can any work get done?

Well, it is a big downward spiral once you saturate your disks.  You
can't get queries done quickly enough, and table size increases with
all those updates, and you have to run vacuum to help that, and that
slows the queries more.  Lather rinse repeat.

The new server, with 14 ultra-160 SCSI disks on a hardware RAID10, is
going to the data center tomorrow, and hopefully will be live by
the weekend (if I can get eRServer to replicate the data to the slave
without having to shut down the whole system).

Then I'll let ya know how long the vacuum takes ;-)

PS: Last time I tried a vacuum full on my largest table, I gave up
after 14 hours of down time.  That was not good for our business...



Re: [GENERAL] Buglist

2003-08-26 Thread Vivek Khera
> "TL" == Tom Lane <[EMAIL PROTECTED]> writes:

TL> Just nice'ing the VACUUM process is likely to be counterproductive
TL> because of locking issues (priority inversion).  Though if anyone cares
TL> to try it on a heavily-loaded system, I'd be interested to hear the
TL> results...

tried it once.  didn't make much difference except that vacuum took
longer than normal.  i didn't see any deadlocks.

i actually figured out what my main problem was.  vacuum every 6 hours
on my two busiest tables was taking longer than 6 hours when we were
very busy...

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.Khera Communications, Inc.
Internet: [EMAIL PROTECTED]   Rockville, MD   +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/



Re: [GENERAL] Buglist

2003-08-26 Thread Jan Wieck
Bruce Momjian wrote:

> Jan Wieck wrote:
> > Manfred Koizar wrote:
> > > On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
> > > <[EMAIL PROTECTED]> wrote:
> > > > Point I am trying to make is to tune FSM and autovacuum frequency
> > > > such that you catch all the dead tuples in RAM
> > >
> > > You might be able to catch the pages with dead tuples in RAM, but
> > > currently there's no way to keep VACUUM from reading in all the clean
> > > pages, which can be far more ...
> >
> > Which leads us to a zero gravity vacuum, that does the lazy vacuum for
> > pages currently available in the buffer cache only. And another pg_stat
> > column telling the number of tuples vacuumed so that an autovac has a
> > chance to avoid IO-consuming vacuum runs for relations where 99% of the
> > dead tuples have been caught in memory.
>
> What would be really interesting is to look for dead tuples when you
> write/discard a buffer page and add them to the FSM --- that is probably
> the latest time you still have access to the page and has the highest
> probability of being recyclable.

True, but it's again in the time-critical path of a foreground
application, because it's done by a backend that has to read another page
on behalf of a waiting client right now. Also, there is only a small
probability that all the pages required to do the index purge for the
tuples reclaimed are in memory too. Plus there is still no direct
connection between a heap tuple's ctid and the physical location of its
index tuples, so purging an index requires a full scan of it, which is
best done in bulk operations.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [GENERAL] Buglist

2003-08-26 Thread Bruce Momjian
Jan Wieck wrote:
> Manfred Koizar wrote:
> > On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
> > <[EMAIL PROTECTED]> wrote:
> >>Point I am trying to make is to tune FSM and autovacuum frequency 
> >>such that you catch all the dead tuples in RAM
> > 
> > You might be able to catch the pages with dead tuples in RAM, but
> > currently there's no way to keep VACUUM from reading in all the clean
> > pages, which can be far more ...
> 
> Which leads us to a zero gravity vacuum, that does the lazy vacuum for 
> pages currently available in the buffer cache only. And another pg_stat 
> column telling the number of tuples vacuumed so that an autovac has a 
> chance to avoid IO consuming vacuum runs for relations where 99% of the 
> dead tuples have been caught in memory.

What would be really interesting is to look for dead tuples when you
write/discard a buffer page and add them to the FSM --- that is probably
the latest time you still have access to the page and has the highest
probability of being recyclable.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [GENERAL] Buglist

2003-08-25 Thread Franco Bruno Borghesi
Initial beta release of plPHP http://www.postgresql.org/news/143.html

On Tue, 2003-08-19 at 10:46, David Siebert wrote:

> I learned MySQL then went on to Postgres. I chose Postgres for my in-house
> project just because of the row locking and transactions. Looking back I
> could have used MySQL. I have yet to use stored procedures or many of the
> high-level functions of Postgres, but transactions make things so much
> cleaner. I do not think MySQL is a bad system. It works well for many
> people in many situations. I think that MySQL and SAP getting together
> could be very exciting. When it comes to SQL databases I would say we have
> a wealth of good choices. This "if I use PHP I have to use MySQL" business
> is a load of tripe. PHP can work just fine with Postgres. I hate to even
> suggest this, but has anyone thought of adding PHP to the languages that
> you can use to write stored procedures in Postgres?


> Roderick A. Anderson wrote:
>
> > On 19 Aug 2003, Bo Lorentsen wrote:
> >
> > > Also, has anyone tried to compare the new transaction model in MySQL
> > > 4.x to PostgreSQL ?
> >
> > Bo, I've recently started having to deal with MySQL. (Web sites
> > wanting/using php _need/have-to-have_ MySQL. Their words not mine.) And
> > going from "I dislike MySQL" to "I'm really hating MySQL" has been
> > getting easier and easier.
> > My dealings with MySQL are for the 3.xx version, but I semi-followed a
> > thread on this several months ago so feel fully qualified to throw in
> > my views. :-) My take on others' research was that MySQL's transaction
> > model is a bubble-gum-and-baling-wire add-on, not an integral part of
> > MySQL. It _was_ tacked onto the top of the database, so if either it or
> > MySQL failed you were likely to lose data.
> >
> > > I'm looking forward to receiving even more constructive arguments :-)
> >
> > How about "Friends don't let friends use MySQL"?
> >
> > Hopefully others with stronger knowledge will provide this.
> >
> > Rod





Re: [GENERAL] Buglist

2003-08-22 Thread Jan Wieck
Claudio Lapidus wrote:
> Bruno Wolff III wrote:
> > On Fri, Aug 22, 2003 at 12:17:41 +0530,
> >   Shridhar Daithankar <[EMAIL PROTECTED]> wrote:
> > >
> > > Idea of autovacuum is to reduce load on vacuum full. If you set
> > > shared_buffers higher and FSM properly for the update/delete load,
> > > autovacuum is expected to catch most of the dead tuples in shared
> > > cache only. If it is successful in doubling the frequency on vacuum
> > > full, that's a big win, isn't it?
> >
> > If you run a normal vacuum often enough, you shouldn't need to regularly
> > run vacuum full.
>
> Hmm, here we have a certain table, sort of FIFO, rows get inserted all the
> time, lay there for a couple of hours and get deleted "the other end
> around". We run normal vacuum almost constantly, but the table keeps
> growing. We had to implement a 'vacuum full' once a week to keep it under
> control.

What is the size of your database, how many tables do you have and what
are your FSM settings?
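
(A quick way to check those, assuming 7.4 or later: the trailing INFO
lines of a database-wide

VACUUM VERBOSE;

report how many relations and pages the free space map is currently
tracking versus how many it is sized for.)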

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [GENERAL] Buglist

2003-08-22 Thread scott.marlowe
On Fri, 22 Aug 2003, Claudio Lapidus wrote:

> Bruno Wolff III wrote:
> > On Fri, Aug 22, 2003 at 12:17:41 +0530,
> >   Shridhar Daithankar <[EMAIL PROTECTED]> wrote:
> > >
> > > Idea of autovacuum is to reduce load on vacuum full. If you set
> > > shared_buffers higher and FSM properly for the update/delete load,
> > > autovacuum is expected to catch most of the dead tuples in shared
> > > cache only. If it is successful in doubling the frequency on vacuum
> > > full, that's a big win, isn't it?
> >
> > If you run a normal vacuum often enough, you shouldn't need to regularly
> > run vacuum full.
> 
> Hmm, here we have a certain table, sort of FIFO, rows get inserted all the
> time, lay there for a couple of hours and get deleted "the other end
> around". We run normal vacuum almost constantly, but the table keeps
> growing. We had to implement a 'vacuum full' once a week to keep it under
> control.

You may just need to increase your fsm settings in postgresql.conf to get 
a regular vacuum to work the way you want.
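
For illustration only, the relevant knobs look like this; the values are
placeholders that must be sized from your own update/delete volume
between vacuums, and both parameters exist only on servers before 8.4:

# postgresql.conf (pre-8.4): free space map sizing
max_fsm_pages = 500000     # pages with free space tracked cluster-wide
max_fsm_relations = 1000   # distinct relations tracked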




Re: [GENERAL] Buglist

2003-08-22 Thread Shridhar Daithankar
On 22 Aug 2003 at 12:35, Claudio Lapidus wrote:

> Bruno Wolff III wrote:
> > On Fri, Aug 22, 2003 at 12:17:41 +0530,
> >   Shridhar Daithankar <[EMAIL PROTECTED]> wrote:
> > >
> > > Idea of autovacuum is to reduce load on vacuum full. If you set
> > > shared_buffers higher and FSM properly for the update/delete load,
> > > autovacuum is expected to catch most of the dead tuples in shared
> > > cache only. If it is successful in doubling the frequency on vacuum
> > > full, that's a big win, isn't it?
> >
> > If you run a normal vacuum often enough, you shouldn't need to regularly
> > run vacuum full.
> 
> Hmm, here we have a certain table, sort of FIFO, rows get inserted all the
> time, lay there for a couple of hours and get deleted "the other end
> around". We run normal vacuum almost constantly, but the table keeps
> growing. We had to implement a 'vacuum full' once a week to keep it under
> control.

I think you could benefit from some kind of emulated partitioning. If a
large number of rows gets deleted at the end of every couple of hours,
you could just drop that sub-table, as in the sketch below.
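
A rough SQL sketch of the idea; every object name here is invented for
illustration. Writers alternate between two tables every few hours,
readers go through the view, and expiring the old rows becomes one cheap
TRUNCATE instead of a mass DELETE followed by vacuum:

CREATE TABLE fifo_a (id int, payload text, inserted timestamptz DEFAULT now());
CREATE TABLE fifo_b (id int, payload text, inserted timestamptz DEFAULT now());

CREATE VIEW fifo AS
    SELECT * FROM fifo_a
    UNION ALL
    SELECT * FROM fifo_b;

-- Once the idle table holds only expired rows, reclaim it in one shot
-- and make it the new insert target:
TRUNCATE TABLE fifo_a;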

Bye
 Shridhar

--
We come to bury DOS, not to praise it.(Paul Vojta, [EMAIL PROTECTED], 
paraphrasing a quote of Shakespeare)




Re: [GENERAL] Buglist

2003-08-22 Thread Claudio Lapidus
Bruno Wolff III wrote:
> On Fri, Aug 22, 2003 at 12:17:41 +0530,
>   Shridhar Daithankar <[EMAIL PROTECTED]> wrote:
> >
> > Idea of autovacuum is to reduce load on vacuum full. If you set
> > shared_buffers higher and FSM properly for the update/delete load,
> > autovacuum is expected to catch most of the dead tuples in shared
> > cache only. If it is successful in doubling the frequency on vacuum
> > full, that's a big win, isn't it?
>
> If you run a normal vacuum often enough, you shouldn't need to regularly
> run vacuum full.

Hmm, here we have a certain table, sort of FIFO, rows get inserted all the
time, lay there for a couple of hours and get deleted "the other end
around". We run normal vacuum almost constantly, but the table keeps
growing. We had to implement a 'vacuum full' once a week to keep it under
control.

cl.




Re: [GENERAL] Buglist

2003-08-21 Thread Jan Wieck
Manfred Koizar wrote:
> On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
> <[EMAIL PROTECTED]> wrote:
> > Point I am trying to make is to tune FSM and autovacuum frequency
> > such that you catch all the dead tuples in RAM
>
> You might be able to catch the pages with dead tuples in RAM, but
> currently there's no way to keep VACUUM from reading in all the clean
> pages, which can be far more ...
Which leads us to a zero gravity vacuum, that does the lazy vacuum for 
pages currently available in the buffer cache only. And another pg_stat 
column telling the number of tuples vacuumed so that an autovac has a 
chance to avoid IO consuming vacuum runs for relations where 99% of the 
dead tuples have been caught in memory.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [GENERAL] Buglist

2003-08-21 Thread Manfred Koizar
On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
<[EMAIL PROTECTED]> wrote:
>Point I am trying to make is to tune FSM and autovacuum frequency 
>such that you catch all the dead tuples in RAM

You might be able to catch the pages with dead tuples in RAM, but
currently there's no way to keep VACUUM from reading in all the clean
pages, which can be far more ...

Servus
 Manfred



Re: [GENERAL] Buglist

2003-08-21 Thread Edmund Dengler
What I am pointing out is that this is all the same issue, and that
solutions to the "we can't do priorities because of locking issues"
objection have existed for many years. I/O is the same as processors: it
is a resource that needs managing. So the intelligence can be made to
exist; it just needs to be built.

Now onto other questions: can vacuuming be done without locks? Can it be
done in parts (ie, lock only a bit)? Can the I/O be better managed? Is
this a general model that would work well?

I have plenty of queries that I would love to run on an "as the system
allows" basis, or on a "keep a bit of spare cycles or I/O for the
important stuff" basis, but which I cannot specify. So a vote from me for
any mechanism that allows priorities to be specified. If this is a desired
feature, then comes the hard part of working out what is feasible, what
can be done in a reasonable amount of time, and of doing it.

Regards!
Ed

On Thu, 21 Aug 2003, Andrew Sullivan wrote:

> On Thu, Aug 21, 2003 at 12:05:28PM -0400, Edmund Dengler wrote:
> > Well, if they are locked waiting on vacuum, then vacuum should upgrade
> > its priority to the highest waiting process (priority inheritance).
> > This way, vacuum will be running at a priority level equivalent to who is
> > waiting on it.
>
> Right, but all that intelligence is something that isn't in there
> now.  And anyway, the real issue is I/O, not processor.
>
> A
>
> --
> 
> Andrew Sullivan 204-4141 Yonge Street
> Liberty RMS   Toronto, Ontario Canada
> <[EMAIL PROTECTED]>  M2P 2A8
>  +1 416 646 3304 x110




Re: [GENERAL] Buglist

2003-08-21 Thread Andrew Sullivan
On Thu, Aug 21, 2003 at 09:10:34PM +0530, Shridhar Daithankar wrote:
> Well, nothing can help if the database has dead tuples already.
> Sometime somebody has to take the time to run vacuum full and/or
> a database reload to get to a clean state.

But if you have a busy system, you'll have new dead tuples.

> Point I am trying to make is to tune FSM and autovacuum frequency
> such that you catch all the dead tuples in RAM, which is a
> non-blocking operation at the expense of some CPU power. I am sure
> the 1 min autovacuum I suggested is waaay too aggressive for any
> scheduled vacuum, isn't it?

Not for some cases.  In a (say) 40%-write situation, you have _lots_ of
dead tuples.  Perhaps you can make the application more efficient,
but that's not always an option (maybe you don't have the code).

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110




Re: [GENERAL] Buglist

2003-08-21 Thread Andrew Sullivan
On Thu, Aug 21, 2003 at 12:05:28PM -0400, Edmund Dengler wrote:
> Well, if they are locked waiting on vacuum, then vacuum should upgrade
> its priority to the highest waiting process (priority inheritance).
> This way, vacuum will be running at a priority level equivalent to who is
> waiting on it.

Right, but all that intelligence is something that isn't in there
now.  And anyway, the real issue is I/O, not processor.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110




Re: [GENERAL] Buglist

2003-08-21 Thread Edmund Dengler
Well, if they are locked waiting on vacuum, then vacuum should upgrade
its priority to the highest waiting process (priority inheritance).
This way, vacuum will be running at a priority level equivalent to who is
waiting on it.

Regards,
Ed

On Thu, 21 Aug 2003, Andrew Sullivan wrote:

> On Wed, Aug 20, 2003 at 11:41:41PM +0200, Karsten Hilbert wrote:
> > You mean, like, "nice 19" or so ?
>
> ISTR someone reporting problems with locking on the performance list
> from doing exactly that.  The problem is that the vacuum back end
> might take a lock and then not get any processor time -- in which
> case everybody else gets their processor slice but can't do anything,
> because they have to wait until the niced vacuum process gets back in
> line.
>
> A
>
> --
> 
> Andrew Sullivan 204-4141 Yonge Street
> Liberty RMS   Toronto, Ontario Canada
> <[EMAIL PROTECTED]>  M2P 2A8
>  +1 416 646 3304 x110




Re: [GENERAL] Buglist

2003-08-21 Thread Shridhar Daithankar
On 21 Aug 2003 at 11:26, Andrew Sullivan wrote:

> On Thu, Aug 21, 2003 at 08:38:14PM +0530, Shridhar Daithankar wrote:
> > If a database is clean, i.e. no dead tuples, an autovacuum daemon with a
> > 1 min interval can achieve pretty much the same result, can't it?
> 
> But we're talking about the case of large, busy databases that have
> already choked their disks.  We have the same problem here in our
> test machines.  We start running load tests, and with vacuums nicely
> scheduled and everything we start topping out on the performance
> pretty quickly, because of I/O bottlenecks on the database.  We know
> the difference in I/O bandwidth between our test env. and the
> production env., so we can put in a fudge factor for this; but that's
> it.

Well, nothing can help if the database has dead tuples already. Sometime
somebody has to take the time to run vacuum full and/or a database reload
to get to a clean state.

Point I am trying to make is to tune FSM and autovacuum frequency such that
you catch all the dead tuples in RAM, which is a non-blocking operation at
the expense of some CPU power. I am sure the 1 min autovacuum I suggested
is waaay too aggressive for any scheduled vacuum, isn't it?

It would be really good if vacuum analyze took locks only on the pages it
is dealing with. That way there would be minimal impact on the rest of the
system. I don't know how it is done as of now.

Ideally a vacuum analyze could run in tight loops wasting minimum CPU. But
that is like making the two poles of earth hug each other..

Bye
 Shridhar

--
Bolub's Fourth Law of Computerdom:  Project teams detest weekly progress 
reporting because it so vividly manifests their lack of progress.




Re: [GENERAL] Buglist

2003-08-21 Thread Andrew Sullivan
On Thu, Aug 21, 2003 at 08:38:14PM +0530, Shridhar Daithankar wrote:
> If a database is clean, i.e. no dead tuples, an autovacuum daemon with a
> 1 min interval can achieve pretty much the same result, can't it?

But we're talking about the case of large, busy databases that have
already choked their disks.  We have the same problem here in our
test machines.  We start running load tests, and with vacuums nicely
scheduled and everything we start topping out on the performance
pretty quickly, because of I/O bottlenecks on the database.  We know
the difference in I/O bandwidth between our test env. and the
production env., so we can put in a fudge factor for this; but that's
it.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110




Re: [GENERAL] Buglist

2003-08-21 Thread Shridhar Daithankar
On 21 Aug 2003 at 11:01, Andrew Sullivan wrote:

> On Thu, Aug 21, 2003 at 03:40:29PM +1000, Martijn van Oosterhout wrote:
> > Given lazy vacuum doesn't hold locks for long periods, it could be
> > an idea to continuously spend 1% of your disk bandwidth on a
> > background vacuum. As for vacuum full, I don't know if you could do
> > the same thing.
> 
> Assuming that one can keep up with the dust bunnies this way, though,
> one wouldn't need to do vacuum full.  This would definitely be a way
> cool feature, if implementable.

If a database is clean, i.e. no dead tuples, an autovacuum daemon with a
1 min interval can achieve pretty much the same result, can't it?

Bye
 Shridhar

--
Drew's Law of Highway Biology:  The first bug to hit a clean windshield lands 
directly in front   of your eyes.




Re: [GENERAL] Buglist

2003-08-21 Thread Shridhar Daithankar
On 21 Aug 2003 at 10:59, Andrew Sullivan wrote:

> On Wed, Aug 20, 2003 at 05:58:32PM -0400, Tom Lane wrote:
> > Jan Wieck <[EMAIL PROTECTED]> writes:
> > > the LRU chain but rather at its end? This way a vacuum on a large table
> > > will not cause a complete cache eviction.
> > 
> > I think what we really need is a way to schedule VACUUM's I/O at a lower
> > priority than normal I/Os.  Wouldn't be very portable :-( ... but if the
> 
> Hey, they both sound like nifty ideas to me!  The portability sure
> worries me, though, on that I/O trick.  Still, Oracle (f'rinstance)
> made all kinds of optimisations for Sun (and conversely) partly
> because, I expect, that's where a lot of their users were, and the
> performance or reliability gains were significant.  Whether that is
> worth doing for PostgreSQL, when there are probably lots of other
> targets to aim at, is an open question.

Well, if you guys remember my posts on performance recently, the project in
question will probably drift to MySQL, as the performance requirement on
the Solaris platform seems pretty steep for PostgreSQL.

Personally I think inserting 5M rows into an 11-column table should not
take more than an hour (that's the performance criterion). But apparently
PostgreSQL is not making it, no matter what..

Just an FYI, I think we need to do something for Solaris. If a horse does
not drink despite being taken to water, throw him in the water.. After all,
it's the database users who are stuck. Not Sun..

Bye
 Shridhar

--
Fun Facts, #14: In table tennis, whoever gets 21 points first wins.  That's how 
it once was in baseball -- whoever got 21 runs first won.




Re: [GENERAL] Buglist

2003-08-21 Thread Andrew Sullivan
On Wed, Aug 20, 2003 at 11:41:41PM +0200, Karsten Hilbert wrote:
> You mean, like, "nice 19" or so ?

ISTR someone reporting problems with locking on the performance list
from doing exactly that.  The problem is that the vacuum back end
might take a lock and then not get any processor time -- in which
case everybody else gets their processor slice but can't do anything,
because they have to wait until the niced vacuum process gets back in
line.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110




Re: [GENERAL] Buglist

2003-08-21 Thread Andrew Sullivan
On Thu, Aug 21, 2003 at 03:40:29PM +1000, Martijn van Oosterhout wrote:
> Given lazy vacuum doesn't hold locks for long periods, it could be
> an idea to continuously spend 1% of your disk bandwidth on a
> background vacuum. As for vacuum full, I don't know if you could do
> the same thing.

Assuming that one can keep up with the dust bunnies this way, though,
one wouldn't need to do vacuum full.  This would definitely be a way
cool feature, if implementable.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110




Re: [GENERAL] Buglist

2003-08-21 Thread Andrew Sullivan
On Wed, Aug 20, 2003 at 05:58:32PM -0400, Tom Lane wrote:
> Jan Wieck <[EMAIL PROTECTED]> writes:
> > the LRU chain but rather at its end? This way a vacuum on a large table
> > will not cause a complete cache eviction.
> 
> I think what we really need is a way to schedule VACUUM's I/O at a lower
> priority than normal I/Os.  Wouldn't be very portable :-( ... but if the

Hey, they both sound like nifty ideas to me!  The portability sure
worries me, though, on that I/O trick.  Still, Oracle (f'rinstance)
made all kinds of optimisations for Sun (and conversely) partly
because, I expect, that's where a lot of their users were, and the
performance or reliability gains were significant.  Whether that is
worth doing for PostgreSQL, when there are probably lots of other
targets to aim at, is an open question.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110




Re: [GENERAL] Buglist

2003-08-20 Thread Edmund Dengler
What about the use of priority inheritance to deal with the issue of
priority inversion (a standard methodology within the real-time world)?

Then we could have priorities, but still have low priority processes
bumped up if a high level one is waiting on them.

Regards,
Ed

On Wed, 20 Aug 2003, Tom Lane wrote:

> Andrew Sullivan <[EMAIL PROTECTED]> writes:
> >> I disagree.  Triggering a vacuum on a db that is nearly saturating the
> >> disk bandwidth has a significant impact.
>
> > Vivek is right about this.  If your system is already very busy, then
> > a vacuum on a largish table is painful.
>
> > I don't actually think having the process done in real time will
> > help, though -- it seems to me what would be more useful is an even
> > lazier vacuum: something that could be told "clean up as cycles are
> > available, but make sure you stay out of the way."  Of course, that's
> > easy to say glibly, and mighty hard to do, I expect.
>
> I'd love to be able to do that, but I can't think of a good way.
>
> Just nice'ing the VACUUM process is likely to be counterproductive
> because of locking issues (priority inversion).  Though if anyone cares
> to try it on a heavily-loaded system, I'd be interested to hear the
> results...
>
>   regards, tom lane




Re: [GENERAL] Buglist

2003-08-20 Thread Karsten Hilbert
> it seems to me what would be more useful is an even
> lazier vacuum: something that could be told "clean up as cycles are
> available, but make sure you stay out of the way."  Of course, that's
> easy to say glibly, and mighty hard to do, I expect.
You mean, like, "nice 19" or so ?

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346



Re: [GENERAL] Buglist

2003-08-20 Thread Tom Lane
Andrew Sullivan <[EMAIL PROTECTED]> writes:
>> I disagree.  Triggering a vacuum on a db that is nearly saturating the
>> disk bandwidth has a significant impact.

> Vivek is right about this.  If your system is already very busy, then
> a vacuum on a largish table is painful.

> I don't actually think having the process done in real time will
> help, though -- it seems to me what would be more useful is an even
> lazier vacuum: something that could be told "clean up as cycles are
> available, but make sure you stay out of the way."  Of course, that's
> easy to say glibly, and mighty hard to do, I expect.

I'd love to be able to do that, but I can't think of a good way.

Just nice'ing the VACUUM process is likely to be counterproductive
because of locking issues (priority inversion).  Though if anyone cares
to try it on a heavily-loaded system, I'd be interested to hear the
results...

regards, tom lane



Re: [GENERAL] Buglist

2003-08-20 Thread Jan Wieck
Andrew Sullivan wrote:
> On Wed, Aug 20, 2003 at 12:40:03PM -0400, Vivek Khera wrote:
> > > "BW" == Bruno Wolff III writes:
> > BW> Also, since at least 7.3, normal vacuums aren't normally going to
> > BW> affect the performance of your database server that much.
> >
> > I disagree.  Triggering a vacuum on a db that is nearly saturating the
> > disk bandwidth has a significant impact.
>
> Vivek is right about this.  If your system is already very busy, then
> a vacuum on a largish table is painful.
>
> I don't actually think having the process done in real time will
> help, though -- it seems to me what would be more useful is an even
> lazier vacuum: something that could be told "clean up as cycles are
> available, but make sure you stay out of the way."  Of course, that's
> easy to say glibly, and mighty hard to do, I expect.

What about a little hint to the buffer management that if it has to
evict another buffer to physically read this one (meaning the buffer
pool was full already) then it will not put this buffer at the top of
the LRU chain but rather at its end? This way a vacuum on a large table
will not cause a complete cache eviction.

Might be a useful hint for sequential scans too.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [GENERAL] Buglist

2003-08-20 Thread Andrew Sullivan
On Wed, Aug 20, 2003 at 12:40:03PM -0400, Vivek Khera wrote:
> > "BW" == Bruno Wolff III writes:
> BW> Also, since at least 7.3, normal vacuums aren't normally going to
> BW> affect the performance of your database server that much.
> 
> I disagree.  Triggering a vacuum on a db that is nearly saturating the
> disk bandwidth has a significant impact.

Vivek is right about this.  If your system is already very busy, then
a vacuum on a largish table is painful.

I don't actually think having the process done in real time will
help, though -- it seems to me what would be more useful is an even
lazier vacuum: something that could be told "clean up as cycles are
available, but make sure you stay out of the way."  Of course, that's
easy to say glibly, and mighty hard to do, I expect.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110




Re: [GENERAL] Buglist

2003-08-20 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes:
> Aah - there is the first bullet hole in my multi-ctid-index-idea. Now 
> the question becomes how expensive these tests are (if a normal backend 
> can do them at all within reason)?

It's not hugely expensive, IIRC, you just need to make some additional
checks against global xmin (compare HeapTupleSatisfiesVacuum against
the others).  We're already doing something similar for the optimization
that suppresses subsequent heap lookups for globally-dead index tuples.

I'm dubious about the multi-ctid idea though because it would mean
bloating the index tuple header, whether there was any use for multiple
entries or not.  (Could we make the header variable size?  Not sure it's
worth the trouble.)

regards, tom lane



Re: [GENERAL] Buglist

2003-08-20 Thread Jan Wieck
Tom Lane wrote:

> Recall also that "committed deleted" does not mean "safe to remove".
> There may still be live transactions that could see the tuple.  The
> "committed deleted" bit just exists to allow subsequent visitors to the
> row to skip one of the more expensive steps in deciding whether they can
> see it or not.
>
> To determine that a row is removable requires additional tests above and
> beyond what backends normally do.

Aah - there is the first bullet hole in my multi-ctid-index-idea. Now
the question becomes how expensive these tests are (if a normal backend
can do them at all within reason)?

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [GENERAL] Buglist

2003-08-20 Thread Tom Lane
Stephan Szabo <[EMAIL PROTECTED]> writes:
> True, but the message being responded to was specifically "if the backend
> were to do the checking for external references upon updating/deleting a
> row".

It's clearly impossible for a backend to remove a row immediately upon
updating/deleting it, since it cannot know whether it will succeed in
committing its transaction.  The implementable variant of this would
correspond to extending the check-whether-committed-deleted code to see
whether a previously deleted tuple is now removable --- that is, moving
VACUUM's processing of the tuple into the main line.

> In any case, I thought it only does the committed deleted stuff
> when it comes upon a row in a scan, which means that it's still not
> automatic clean up in general since any particular deleted row may not get
> looked at for some amount of time after all possible viewers are gone.

Recall also that "committed deleted" does not mean "safe to remove".
There may still be live transactions that could see the tuple.  The
"committed deleted" bit just exists to allow subsequent visitors to the
row to skip one of the more expensive steps in deciding whether they can
see it or not.

To determine that a row is removable requires additional tests above and
beyond what backends normally do.

regards, tom lane



Re: [GENERAL] Buglist

2003-08-20 Thread Tom Lane
Vivek Khera <[EMAIL PROTECTED]> writes:
> "JW" == Jan Wieck <[EMAIL PROTECTED]> writes:
> JW> remove all the index entries pointing to these ctid's. Your idea is (so 
> JW> far) lacking a place where to remember all the single removed rows and I
> JW> assume you're not planning to pay the cost of a full scan over all 
> JW> indexes of a table to reclaim the space of one data row, are you?

> Well, that pretty much kills my idea...  back to autovacuum ;-)

In addition to the index-cleanup issue that Jan explained, there are
locking problems.  The tuple-is-dead hint bit mechanism is very
carefully designed so that a backend can set the hint bits while holding
only a shared lock on the page containing the tuple.  Physically
removing a tuple requires a far stronger lock (see the source code for
details).  Thus, having ordinary interactive backends remove tuples
would have bad consequences for concurrent performance.

But I think the real point here is that there's no reason to think that
doing tuple deletion on-the-fly in foreground transactions is superior
to doing it in background with a vacuum process.  You're taking what
should be noncritical maintenance work and moving it into the critical
paths of your foreground applications.  Not only that, but you're
probably doing more total work per tuple --- VACUUM "batches" its work
in more ways than just the index cleanup aspect, IIRC.

regards, tom lane



Re: [GENERAL] Buglist

2003-08-20 Thread scott.marlowe
On Wed, 20 Aug 2003, Vivek Khera wrote:

> > "BW" == Bruno Wolff III writes:
> 
> >> to see it incremental.  This would result in pretty much near zero
> >> internal fragmentation, I think.
> 
> BW> Why do you care about the details of the implementation (rather than
> BW> the performance)? If it were faster to do it that way, that's how it would
> BW> have been done in the first place. The cost of doing the above is almost
> BW> certainly going to be an overall performance loser.
> 
> I care for the performance.  And how are you so sure that it was
> faster the way it is now?  Are you sure it was not done this way
> because of ease of implementation?
> 
> Seriously, how much slower can it be if the backend were to do the
> checking for external references upon updating/deleting a row?  The
> cost would be distributed across time as opposed to concentrated at
> once within a vacuum process.  I am fairly certain it would reduce
> disk bandwidth requirements since at least one necessary page will
> already be in memory.

Time for a mental exercise.

Our server has 2 users.  Each backend has to check with all the other
backends when it deletes a tuple (every update is also a delete; remember,
every change in an MVCC database is a create / delete cycle.)  Let's create
names for the time it takes to do the update / mark deleted versus the
time it takes to contact each of those other backends.  Tw is the time to
do the work here, and Tc is the time to do the cleanup (i.e. vacuum the
tuple).  Note that we'd also need a Ta for answering the requests of all
the other backends, but we can assume that on average, for each request a
child process makes, it will receive exactly that many from each other
backend running.  Let x represent the number of backends.  So the answer
time Ta is equal to x*Tc.

Time = Tw + Tc + Ta

Time = Tw + Tc + (x * Tc)

Time = Tw + ((x+1) * Tc)

and our cleanup time starts to grow at an ugly rate as the number of 
backends increases.  Lazy vacuuming allows the database to reclaim lost 
space in the background, as the newer non-full vacuum does.
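
(Purely illustrative numbers: with Tw = 1 ms, Tc = 0.5 ms and x = 200
backends, Time = 1 + 201 * 0.5 = 101.5 ms, so the hypothetical cleanup
chatter would dwarf the original millisecond of real work.)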

Many folks mistake this vacuum process for its older, slower cousin, full 
vacuum, which does eat a lot more disk bandwidth and slow the machine 
down.

On a Dual CPU X86 box a lazy vacuum running in a continuous loop will eat 
about 5% of one CPU and drop pgbench scores by 10 to 15%.  The important 
thing here, is that the machine will still run quite snappily when you 
throw several hundred clients at it, since the lazy vacuum just sits in 
the background using the spare cycles and not much more.

That means your storage usage may balloon somewhat under intense usage, but
you won't have an IPC storm kill the performance of the postgresql server.

Knowing the postgresql development team, I'm sure the reasons they chose
this design are clearly stated in the hackers mailing list somewhere in
time, so I'm gonna go look.  But trust me on one thing: the guys programming
this database don't do much just because it's easier / faster to implement,
without putting something in the TODO list about making it better some day.




Re: [GENERAL] Buglist

2003-08-20 Thread Alvaro Herrera
On Wed, Aug 20, 2003 at 10:31:25AM -0400, Vivek Khera wrote:

> Seriously, how much slower can it be if the backend were to do the
> checking for external references upon updating/deleting a row?  The
> cost would be distributed across time as opposed to concentrated at
> once within a vacuum process.  I am fairly certain it would reduce
> disk bandwidth requirements since at least one necessary page will
> already be in memory.

There's no way to check for "external references", because said
references are actually the running transactions.  So you can't drop a
row until all the transactions that were active during your transaction
are finished.  Certainly your own backend can't do the checking, because
there's no way to even assert that it will be live when those other
transactions finish.  Who will?  The last of said transactions?  It
certainly will be expensive for a backend to keep track of the tuples
deleted/updated by all _other_ backends, just in case...  Long-running
transactions can't be neglected, so you can't keep it all in memory.

IMHO the solution to this problem will come in the form of an
autovacuum daemon integrated into the backend...

-- 
Alvaro Herrera ()
"Ninguna manada de bestias tiene una voz tan horrible como la humana" (Orual)



Re: [GENERAL] Buglist

2003-08-20 Thread Bruno Wolff III
On Wed, Aug 20, 2003 at 10:31:25 -0400,
  Vivek Khera <[EMAIL PROTECTED]> wrote:
> 
> I care for the performance.  And how are you so sure that it was
> faster the way it is now?  Are you sure it was not done this way
> because of ease of implementation?
> 
> Seriously, how much slower can it be if the backend were to do the
> checking for external references upon updating/deleting a row?  The
> cost would be distributed across time as opposed to concentrated at
> once within a vacuum process.  I am fairly certain it would reduce
> disk bandwidth requirements since at least one necessary page will
> already be in memory.

It would probably be a lot slower. Any transaction that has started
but not yet finished would need to lock all rows that exist during
the transaction (for serialized transaction isolation you would only
need to worry about rows that existed at the start of the transaction
or that were modified by the transaction). Immediately that is a big
deal since a typical query may need to lock a bunch of rows that it
will never actually touch (but you don't know that when the transaction
starts). Managing all those locks would take up a lot of semiconductor 
memory or a lot of disk writes and be a major source of contention.
The current system just has to mark rows when they are created and
deleted (an update does both operations). The normal vacuum clean up
actually isn't going to be much worse than what you would need to do
at both the start and end of each transaction. The overhead of letting
dead rows hang around for a while after they aren't needed isn't that high.

Also, since at least 7.3, normal vacuums aren't normally going to affect the
performance of your database server that much.

The main issue against the current vacuum system is that it requires the
DBA knowing what vacuum does and figuring out how it should be used in
their situation to get reasonable performance. This makes it a bit harder
for non-DBAs to jump right in to Postgres without running into problems.

However, the work on autovacuum seems to be providing a reasonable solution
to that problem.



Re: [GENERAL] Buglist

2003-08-20 Thread Vivek Khera
> "BW" == Bruno Wolff III writes:

>> to see it incremental.  This would result in pretty much near zero
>> internal fragmentation, I think.

BW> Why do you care about the details of the implementation (rather than
BW> the performance)? If it were faster to do it that way, that's how it would
BW> have been done in the first place. The cost of doing the above is almost
BW> certainly going to be an overall performance loser.

I care for the performance.  And how are you so sure that it was
faster the way it is now?  Are you sure it was not done this way
because of ease of implementation?

Seriously, how much slower can it be if the backend were to do the
checking for external references upon updating/deleting a row?  The
cost would be distributed across time as opposed to concentrated at
once within a vacuum process.  I am fairly certain it would reduce
disk bandwidth requirements since at least one necessary page will
already be in memory.



Re: [GENERAL] Buglist

2003-08-20 Thread Bruno Wolff III
On Tue, Aug 19, 2003 at 21:51:14 -0400,
  Vivek Khera <[EMAIL PROTECTED]> wrote:
> 
> I'm not promoting any change in the MVCC.  What I'm saying is that it
> would be really cool if the backend process itself could recognize
> that a row is no longer referenced by any transactions upon
> termination of the transaction, and release it back to the system.
> This is just what vacuum does but in a batched manner.  I would love
> to see it incremental.  This would result in pretty much near zero
> internal fragmentation, I think.

Why do you care about the details of the implementation (rather than
the performance)? If it were faster to do it that way, that's how it would
have been done in the first place. The cost of doing the above is almost
certainly going to be an overall performance loser.



Re: [GENERAL] Buglist

2003-08-19 Thread Vivek Khera
> "MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:

MTO> 
MTO> Changing Postgres to perform as mentioned above is non-trivial, it would
MTO> basically change the entire core of the system.  I think this is due to
MTO> the fact that postgres uses a non-overwriting storage manager.  This has
MTO> many benefits including MVCC, the primary disadvantage is that you need
MTO> a vacuum type process
MTO> 

I'm not promoting any change in the MVCC.  What I'm saying is that it
would be really cool if the backend process itself could recognize
that a row is no longer referenced by any transactions upon
termination of the transaction, and release it back to the system.
This is just what vacuum does but in a batched manner.  I would love
to see it incremental.  This would result in pretty much near zero
internal fragmentation, I think.

How hard that is, I have no clue.  Right now my DB is saturating the
disk and having to squeeze vacuum into a saturated disk bandwidth is
not pleasant.  Luckily, the 14-disk raid array just
arrived... hopefully that will have higher bandwidth than the 4-disk
array... ;-)





Re: [GENERAL] Buglist

2003-08-19 Thread Matthew T. O'Connor
On Tue, 2003-08-19 at 12:13, Vivek Khera wrote:
> There's a big difference between "noticing that a table needs to be
> vacuumed and running it" and "automatically having the backend free a
> row as soon as we know it is eligible to be (as would normally be
> determined by vacuum)".


Changing Postgres to perform as mentioned above is non-trivial, it would
basically change the entire core of the system.  I think this is due to
the fact that postgres uses a non-overwriting storage manager.  This has
many benefits including MVCC, the primary disadvantage is that you need
a vacuum type process


> One of these days when I can afford a 14-hour dump/restore, I'll
> upgrade to 7.4 and try autovacuum :-)

pg_autovacuum works with 7.3.x, but the source is only included in
7.4.  Just get the pg_autovacuum directory from contrib and use it.

Matthew




Re: [GENERAL] Buglist

2003-08-19 Thread Ian Barwick
On Tuesday 19 August 2003 23:10, scott.marlowe wrote:
> On 19 Aug 2003, Bo Lorentsen wrote:
> > On Tue, 2003-08-19 at 18:17, Vivek Khera wrote:
> > > Since the beginning of time (at least MySQL v3.22) MySQL has silently
> > > ignored the foreign key references in table create statement.  Now
> > > that they have foreign key support (version 4.x), do they honor those
> > > statements?  Nope.  You have to use their own syntax to declare your
> > > FKs.  They still silently ignore the references in the table create
> > > statements.
> >
> > Is this really true?  Does 4.x still not support FKs?  Then how about
> > transactions, do they not work too?
> >
> > Is this not just the MyISAM tables that still have the problem (they are
> > version 3.x)?
>
> No, the problem is that in the SQL spec, you do it with the foreign key
> declaration inside parens in the create statement, like:
>
> create table abc123
> (
>   id serial unique,
>   info text);
> create table abc1234
> (
>   moreinfo text,
>   ref_id int,
>   foreign key (ref_id)
>   references abc123(id)
>   on delete cascade
> );
>
> In MySQL this syntax is silently swallowed, while their own "proper"
> syntax is like this:
>
> create table abc123
> (
>   id serial unique,
>   info text)
> type=innodb;
> create table abc1234
> (
>   moreinfo text,
>   ref_id int)
> foreign key (ref_id) references abc123(id)
> on delete CASCADE
> type=innodb;

(To be precise this will fail with an obscure message; an
index must be created on ref_id)

> So the syntaxes are different, and one is apparently swallowed without
> error or anything, but in fact you have no fks in place.

Just to confuse things further:
1: if the MySQL version running is not configured for innodb tables,
tables created with type=innodb will be silently converted to
MyISAM;

2: These statements will succeed:
create table abc123 (
  id INT unique,
  info text
) type=innodb;

create table abc1234  (
  moreinfo text,
  ref_id int REFERENCES abc123(id)
)  type=innodb;

but the foreign key defined on ref_id is (I presume)
transported to a remote forest in Sweden and eaten
by goats ;-)

Ian Barwick
[EMAIL PROTECTED]




Re: [GENERAL] Buglist

2003-08-19 Thread Bo Lorentsen
On Tue, 2003-08-19 at 23:10, scott.marlowe wrote:

> So the syntaxes are different, and one is apparently swallowed without 
> error or anything, but in fact you have no fks in place.
Thanks, that helped.

/BL


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Buglist

2003-08-19 Thread scott.marlowe
On 19 Aug 2003, Bo Lorentsen wrote:

> On Tue, 2003-08-19 at 18:17, Vivek Khera wrote:
> 
> > Since the beginning of time (at least MySQL v3.22) MySQL has silently
> > ignored the foreign key references in the table create statement.  Now
> > that they have foreign key support (version 4.x), do they honor those
> > statements?  Nope.  You have to use their own syntax to declare your
> > FKs.  They still silently ignore the references in the table create
> > statements.
> Is this really true?? Does 4.x still not support FKs? And what about
> transactions, do they not work either?
> 
> Or is it just the MyISAM tables that still have the problem (they are
> version 3.x)?

No, the problem is that in the SQL spec, you do it with the foreign key 
declaration inside parens in the create statement, like:

create table abc123 
(
  id serial unique, 
  info text);
create table abc1234 
(
  moreinfo text, 
  ref_id int, 
  foreign key (ref_id) 
  references abc123(id)
  on delete cascade
);

In MySQL this syntax is silently swallowed, while their own "proper" 
syntax is like this:

create table abc123 
(
  id serial unique, 
  info text)
type=innodb;
create table abc1234 
(
  moreinfo text, 
  ref_id int)
foreign key (ref_id) references abc123(id)
on delete CASCADE
type=innodb;

So the syntaxes are different, and one is apparently swallowed without 
error or anything, but in fact you have no fks in place.



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Buglist

2003-08-19 Thread Joshua D. Drake


> It's still bolted on.  The entire concept that "transactional integrity
> is optional" is ludicrous, IMHO.  "Integrity" and "optional" are
> contradictory.
> 
> 			regards, tom lane

Obviously you have never voted in a major election ;)

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Buglist

2003-08-19 Thread Bruno Wolff III
On Tue, Aug 19, 2003 at 19:17:31 +0530,
  Shridhar Daithankar <[EMAIL PROTECTED]> wrote:
> 
> Making pgsql-bugs open to non-subscribers but moderated might be a
> good idea.  It really does not matter if a bug gets filed a couple of
> days late, but having to subscribe to yet another list is a different
> matter.

All of the pgsql lists including bugs already work this way.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Buglist

2003-08-19 Thread Shridhar Daithankar
On Tuesday 19 August 2003 21:03, Vivek Khera wrote:
> Tops on my wish list is that postgres automatically notice when a row
> is no longer needed (all transactional references to it are gone) and
> 'free' it at that time, rather than needing a special scan to
> determine the row is no longer needed and freeing it.

Heh.. we have autovacuum, right?  Well, it does not work the way you
want, but at least it works automatically.

A couple of releases down the line it will be good enough, and vacuum
problems will be history, provided people take care to set up autovacuum
and the FSM parameters correctly..
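
For example, you can check the current FSM settings from psql (the
actual values are tuned in postgresql.conf, and the defaults are usually
too low for a busy database):

SHOW max_fsm_pages;      -- pages with free space the map can track
SHOW max_fsm_relations;  -- tables and indexes the map can track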

 Shridhar


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [GENERAL] Buglist

2003-08-19 Thread Vivek Khera
> "BL" == Bo Lorentsen <[EMAIL PROTECTED]> writes:

BL> Hi ...
BL> I'm trying to convince my boss to use postgresql (I need RI, transactions
BL> and views), but he keeps comparing the project to mysql.  Until now, I
BL> found the answers to his questions on the www.postgresql.org page, but
BL> now I'm lost :-)

My big reason to choose postgres was concurrency.  My application has
transactions and updates all over the place, and even as a single
developer in the early stages, I could already see problems with
table-level locking that mysql was giving me.  Who knows what would
have happened in production with hundreds of people hitting the db
simultaneously!  The row-level locking in Postgres has made it
possible for an incredible number of simultaneous actions to be
carried out without any waiting for the users.
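
A rough illustration with two psql sessions (hypothetical table; in
Postgres, writers only block writers touching the same row):

-- session 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- session 2, at the same time: neither of these statements waits
SELECT balance FROM accounts WHERE id = 1;  -- sees the old value (MVCC)
UPDATE accounts SET balance = balance + 50 WHERE id = 2;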

Try making a table grow beyond your file size limit in mysql.  You
can't.  Even if you use an OS with 64-bit file pointers (such as
FreeBSD) you can't grow your file beyond 4Gb in mysql (at least with
mysam tables -- dunno about innodb tables).  In Postgres, it is
automagically handled for you.

The *only* drawbacks I find with postgres are the need to dump/restore
on major version updates and the need to vacuum the tables
regularly...

Tops on my wish list is that postgres automatically notice when a row
is no longer needed (all transactional references to it are gone) and
'free' it at that time, rather than needing a special scan to
determine the row is no longer needed and freeing it.

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.Khera Communications, Inc.
Internet: [EMAIL PROTECTED]   Rockville, MD   +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Buglist

2003-08-19 Thread Bo Lorentsen
On Tue, 2003-08-19 at 16:20, Lincoln Yeoh wrote:

> Install an application that can use both DBs. Muck around with it. If you 
> can't tell the difference, then I'd say go with postgresql - transactions 
> isn't bolted on, quite a number of other design wins too. If you can tell 
> the difference and MySQL is better, many of us here would be interested to 
> know.
Ok, thanks.  We may need to make a test utility that is open and fair,
but that's a little hard, as PG has some features that MySQL does not.

> Do lots of concurrent updates and inserts and selects for a long time?
I do think I know why you say this :-)

> Have fun!
I'd like to do this, but I'm not sure that I have the time needed.  If I
find the time and get some results, I'll let you know.

/BL


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [GENERAL] Buglist

2003-08-19 Thread Bo Lorentsen
On Tue, 2003-08-19 at 16:03, Tom Lane wrote:

> It's still bolted on.  The entire concept that "transactional integrity
> is optional" is ludicrous, IMHO.  "Integrity" and "optional" are
> contradictory.
Good point. Also the problem of MyISAM and InnoDB RI :-)

> One thing you should ask about MySQL is where they keep the system's
> metadata (catalog data).  In Postgres it's under transactional control
> just like everything else, which means it's (a) crash-safe and (b)
> rollback-able.  This is why all DDL changes are rollback-able in PG.
> I honestly don't know what the corresponding arrangements are in MySQL
> ... but I suspect that even in an all-InnoDB database, there is critical
> system data that is outside the InnoDB table handler and thus not
> transaction-safe.
That's a really nice thing for temporary tables, but "point in time"
backup is a much stronger argument :-)
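
The DDL rollback is easy to demonstrate in psql (a minimal sketch):

BEGIN;
CREATE TABLE scratch (id int);
INSERT INTO scratch VALUES (1);
ROLLBACK;
SELECT * FROM scratch;  -- ERROR: relation "scratch" does not exist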

/BL


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Buglist

2003-08-19 Thread Devrim GUNDUZ

Hi,

On Tue, 19 Aug 2003, Lincoln Yeoh wrote:

> AFAIK bugzilla requires mysql (for now).
> 
> I've recently installed it and if it can be easily made to work with 
> postgresql I'd like to know.

From https://bugzilla.redhat.com/bugzilla/index.cgi:

  Bugzilla News
  =============
  January 1st, 2003
  Current Red Hat version of Bugzilla using PostgreSQL code available
  for download.


AFAIK RH runs bugzilla on PostgreSQL (or RHDB, whatever). The code is 
available from there.

Regards,



-- 
Devrim GUNDUZ
[EMAIL PROTECTED]   [EMAIL PROTECTED] 
http://www.tdmsoft.com
http://www.gunduz.org


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [GENERAL] Buglist

2003-08-19 Thread Lincoln Yeoh
At 03:59 PM 8/19/2003 +0200, Bo Lorentsen wrote:

> On Tue, 2003-08-19 at 15:47, Shridhar Daithankar wrote:
> 
> > Or have bugzilla set up somewhere. That way the tracking will be a
> > hell of a lot more visible to the outside world..
> 
> I agree on this, as it seems messy from outside not to be able to get an
> overview of both solved and unsolved bugs.
> I know that as a developer, this may not seem like a big problem, but it
> will help non-hackers to get an overview.

AFAIK bugzilla requires mysql (for now).

I've recently installed it and if it can be easily made to work with 
postgresql I'd like to know.

Link.

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org