Re: [HACKERS] switch UNLOGGED to LOGGED

2011-05-29 Thread Noah Misch
On Sat, May 28, 2011 at 09:33:09PM -0400, Robert Haas wrote:
 On Fri, May 27, 2011 at 6:19 AM, Noah Misch n...@leadboat.com wrote:
  So, it's ok to have a log item that is replayed only if
 
  WalRcvInProgress()
 
  is true?
 
  No, that checks for WAL streaming in particular.  A log-shipping standby
  needs the same treatment.
 
  Is it a correct approach? I couldn't find any other way to
  find out if we are in a standby or a master...
 
  InArchiveRecovery looks like the right thing, but it's currently static to
  xlog.c.  Perhaps exporting that is the way to go.
 
 Why is it necessary to replay the operation only on the slave?  Can we
 just use XLOG_HEAP_NEWPAGE?

I don't think it is *necessary*.  If we're replaying WAL on a master, we'll also
be resetting unlogged relations after recovery; what we write or do not write to
them in the mean time has no functional impact.  Seemed like a sensible
optimization, but maybe it's premature.

nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Cédric Villemain
2011/5/29 Tom Lane t...@sss.pgh.pa.us:
 Greg Stark gsst...@mit.edu writes:
 On Sat, May 28, 2011 at 12:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I also found that Greg was right in thinking that it would help if we
 tweaked lazy_scan_heap to not always scan the first
 SKIP_PAGES_THRESHOLD-1 pages even if they were
 all_visible_according_to_vm.

 You fixed the logic only for the first 32 pages which helps with the
 skew. But really the logic is backwards in general. Instead of
 counting how many missed opportunities for skipped pages we've seen in
 the past we should read the bits for the next 32 pages in advance and
 decide what to do before we read those pages.

 OK, do you like the attached version of that logic?  (Other fragments
 of the patch as before.)

The idea was that removing even one page from the VACUUM will prevent
the relfrozenxid update and the reltuples (and relpages) update.
Now, I believe that once we've skipped at least one page thanks to
SKIP_PAGES_THRESHOLD, we should be more aggressive and remove as many
pages as possible from the VACUUM, thanks to the VM.



                        regards, tom lane

 diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
 index 9393fa0727aaad7508e1163623322b4066412257..231447b31223bc5350ce49a136cffafaa53bc5fb 100644
 *** a/src/backend/commands/vacuumlazy.c
 --- b/src/backend/commands/vacuumlazy.c
 *** lazy_scan_heap(Relation onerel, LVRelSta
 *** 311,317 
        int                     i;
        PGRUsage        ru0;
        Buffer          vmbuffer = InvalidBuffer;
 !       BlockNumber all_visible_streak;

        pg_rusage_init(ru0);

 --- 305,312 
        int                     i;
        PGRUsage        ru0;
        Buffer          vmbuffer = InvalidBuffer;
 !       BlockNumber next_not_all_visible_block;
 !       bool            skipping_all_visible_blocks;

        pg_rusage_init(ru0);

 *** lazy_scan_heap(Relation onerel, LVRelSta
 *** 329,340 

        nblocks = RelationGetNumberOfBlocks(onerel);
        vacrelstats->rel_pages = nblocks;
        vacrelstats->nonempty_pages = 0;
        vacrelstats->latestRemovedXid = InvalidTransactionId;

        lazy_space_alloc(vacrelstats, nblocks);

 !       all_visible_streak = 0;
        for (blkno = 0; blkno < nblocks; blkno++)
        {
                Buffer          buf;
 --- 324,369 

        nblocks = RelationGetNumberOfBlocks(onerel);
        vacrelstats->rel_pages = nblocks;
 +       vacrelstats->scanned_pages = 0;
        vacrelstats->nonempty_pages = 0;
        vacrelstats->latestRemovedXid = InvalidTransactionId;

        lazy_space_alloc(vacrelstats, nblocks);

 !       /*
 !        * We want to skip pages that don't require vacuuming according to the
 !        * visibility map, but only when we can skip at least SKIP_PAGES_THRESHOLD
 !        * consecutive pages.  Since we're reading sequentially, the OS should be
 !        * doing readahead for us, so there's no gain in skipping a page now and
 !        * then; that's likely to disable readahead and so be counterproductive.
 !        * Also, skipping even a single page means that we can't update
 !        * relfrozenxid, so we only want to do it if we can skip a goodly number
 !        * of pages.
 !        *
 !        * Before entering the main loop, establish the invariant that
 !        * next_not_all_visible_block is the next block number >= blkno that's
 !        * not all-visible according to the visibility map, or nblocks if there's
 !        * no such block.  Also, we set up the skipping_all_visible_blocks flag,
 !        * which is needed because we need hysteresis in the decision: once we've
 !        * started skipping blocks, we may as well skip everything up to the next
 !        * not-all-visible block.
 !        *
 !        * Note: if scan_all is true, we won't actually skip any pages; but we
 !        * maintain next_not_all_visible_block anyway, so as to set up the
 !        * all_visible_according_to_vm flag correctly for each page.
 !        */
 !       for (next_not_all_visible_block = 0;
 !                next_not_all_visible_block < nblocks;
 !                next_not_all_visible_block++)
 !       {
 !               if (!visibilitymap_test(onerel, next_not_all_visible_block, &vmbuffer))
 !                       break;
 !       }
 !       if (next_not_all_visible_block >= SKIP_PAGES_THRESHOLD)
 !               skipping_all_visible_blocks = true;
 !       else
 !               skipping_all_visible_blocks = false;
 !
        for (blkno = 0; blkno < nblocks; blkno++)
        {
                Buffer          buf;
 *** lazy_scan_heap(Relation onerel, LVRelSta
 *** 347,387 
                OffsetNumber frozen[MaxOffsetNumber];
                int                     nfrozen;
                Size            freespace;
 !               bool            all_visible_according_to_vm = false;
                bool            all_visible;
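To make the skipping decision concrete, here is a self-contained sketch of the logic described in the comment above, with a plain bool array standing in for the visibility map. The variable names follow the quoted hunk; `pages_scanned()` is a mock harness for illustration, not PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>

#define SKIP_PAGES_THRESHOLD 32

/* Next block >= start that is not all-visible, or nblocks if none. */
static int
next_not_all_visible(const bool *vm, int nblocks, int start)
{
    int blkno;

    for (blkno = start; blkno < nblocks; blkno++)
        if (!vm[blkno])
            break;
    return blkno;
}

/* How many of the nblocks pages would this scan actually read? */
static int
pages_scanned(const bool *vm, int nblocks)
{
    int  next_nav = next_not_all_visible(vm, nblocks, 0);
    bool skipping = (next_nav >= SKIP_PAGES_THRESHOLD);
    int  scanned = 0;

    for (int blkno = 0; blkno < nblocks; blkno++)
    {
        if (blkno == next_nav)
        {
            /* Re-establish the invariant for the blocks that follow. */
            next_nav = next_not_all_visible(vm, nblocks, blkno + 1);
            skipping = (next_nav - (blkno + 1)) >= SKIP_PAGES_THRESHOLD;
            scanned++;          /* not-all-visible pages are always read */
        }
        else if (skipping)
            continue;           /* inside a long run of all-visible pages */
        else
            scanned++;          /* all-visible, but the run is too short */
    }
    return scanned;
}
```

The hysteresis shows up in the last case: short all-visible runs are read anyway, so only runs of at least SKIP_PAGES_THRESHOLD pages are ever skipped.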
 

Re: [HACKERS] Getting a bug tracker for the Postgres project

2011-05-29 Thread Stefan Kaltenbrunner
On 05/29/2011 06:04 AM, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, May 28, 2011 at 11:23 PM, Greg Sabino Mullane g...@turnstep.com 
 wrote:
 My own bare bones wish list for such a tracker is:

 * Runs on Postgres
 * Has an email interface

 Make no mistake, whichever we choose, the care and feeding of such a
 beast will require some precious resources in time from at least two
 people, probably more. If there is anyone in the community that
 wants to help the project but hasn't found a way, this is your chance
 to step up! :)
 
 Yeah, agreed.  My basic requirements are:
 
 1. Given a bug number, find the pgsql-bugs emails that mention it in
 the subject line.  Note that the archives would actually MOSTLY do
 this, but for the stupid month-boundary problem which we seem unable
 to fix despite having some of the finest engineers in the world.
 
 Many, many, many bug issues are not associated with a bug report
 submitted through the web interface.  People mail stuff to pgsql-bugs
 manually, or issues turn up in threads on other lists.  If a tracker
 can only find things submitted through the web interface, that is not
 going to lead to everyone filing bugs that way; it's going to lead to
 the tracker being ignored as useless.

Yeah, that's why the original proposal planned to provide an email
interface that you could CC or forward a mail to, which would turn it
into a bug report. That would still require someone to actually do it,
but it is probably no different from moving a discussion on -general
that turns out to be a bug over to -hackers (or -bugs).

 
 2. Associate some kind of status like OPEN, FIXED, NOTABUG,
 WONTFIX, etc. with each such bug via web interface.
 
 Anything that even pretends to be a bug tracker will do that.  The
 real question is, who is going to keep it up to date?  GSM has the
 right point of view here: we need at least a couple of people who
 are willing to invest substantial amounts of time, or it's not going
 to go anywhere.  Seeing that we can barely manage to keep the mailing
 list moderator positions staffed, I'm not hopeful.

I think that a tracker would require a different kind of volunteer, one
that is much easier to find than ML moderators, but I guess unless we
actually try we will never know.


Stefan



[HACKERS] Re: pg_terminate_backend and pg_cancel_backend by not administrator user

2011-05-29 Thread Noah Misch
On Sat, May 28, 2011 at 01:44:20PM -0400, Josh Kupershmidt wrote:
 Anssi and I posted some initial feedback on the patch's goals earlier.
 I would like to ultimately see users have the capability to
 pg_cancel_backend() their own queries. But I could at least conceive
 of others not wanting this behavior enabled by default. So perhaps
 this patch's approach of granting extra privs to the database owner
 could work as a first attempt. And maybe a later version could
 introduce a GUC allowing the DBA to control whether users can
 cancel/terminate their backends, or we could instead have an option
 flag to CREATE/ALTER ROLE, allowing per-user configuration.

What risks arise from unconditionally allowing these calls for the same user's
backends?  `pg_cancel_backend' ought to be safe enough; the user always has
access to the standard cancellation protocol, making the SQL interface a mere
convenience (albeit a compelling one).  `pg_terminate_backend' does open up
access to a new behavior, but no concrete risks come to mind.

On the other hand, this *would* be substantial new authority for database
owners.  Seems like a reasonable authority to grant, though.

 It would be helpful to hear from others whether this patch's goals
 would work as a first pass at this problem, so that Torello doesn't
 waste time on a doomed approach. Also, it might be helpful to add an
 entry on the Todo list for 'allow non-superusers to use
 pg_cancel_backend()', in case this patch gets sunk.
 
 Now, a few technical comments about the patch:
 1.) This bit looks dangerous:
 +backend = pgstat_fetch_stat_beentry(i);
 +if (backend->st_procpid == pid) {
 
 Since pgstat_fetch_stat_beentry() might return NULL.

I think you want BackendPidGetProc().
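The hazard and the suggested fix can be illustrated with a self-contained mock: resolve the target pid to a backend entry first, and treat a NULL result (backend already exited) as "nothing to signal" instead of dereferencing it. `MockProc` and `may_signal()` are illustrations only, not PostgreSQL's real structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct MockProc
{
    int pid;
    int role_id;
} MockProc;

static MockProc backends[] = {{101, 10}, {102, 20}};

/* Stand-in for BackendPidGetProc(): NULL when no backend has that pid. */
static MockProc *
mock_backend_pid_get_proc(int pid)
{
    for (size_t i = 0; i < sizeof(backends) / sizeof(backends[0]); i++)
        if (backends[i].pid == pid)
            return &backends[i];
    return NULL;
}

/* May this role cancel/terminate the backend with the given pid? */
static bool
may_signal(int role_id, bool is_superuser, int pid)
{
    MockProc *proc = mock_backend_pid_get_proc(pid);

    if (proc == NULL)
        return false;           /* the NULL case the patch must handle */
    return is_superuser || proc->role_id == role_id;
}
```

The NULL check is the whole point: a pid can vanish between the caller reading it from pg_stat_activity and the lookup, so the lookup result must never be dereferenced unconditionally.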

Thanks,
nm



Re: [HACKERS] Getting a bug tracker for the Postgres project

2011-05-29 Thread Peter Eisentraut
On sön, 2011-05-29 at 00:04 -0400, Tom Lane wrote:
 Many, many, many bug issues are not associated with a bug report
 submitted through the web interface.  People mail stuff to pgsql-bugs
 manually, or issues turn up in threads on other lists.  If a tracker
 can only find things submitted through the web interface, that is not
 going to lead to everyone filing bugs that way; it's going to lead to
 the tracker being ignored as useless.

I think this doesn't necessarily have to be the case.  I think there are
lots of hackers and users who will sign up for any reasonable bug
tracker as soon as it's introduced.  If you want a better treatment for
your bug, send it to the tracker, if you want the old-style treatment,
send it somewhere else.

That doesn't mean that better integration cannot be worked on later, but
the idea that a bug tracker must have magical total awareness of the
entire flow of information in the project from day one is an illusion,
and it has blocked this business for too long IMO.





Re: [HACKERS] Getting a bug tracker for the Postgres project

2011-05-29 Thread Peter Eisentraut
On sön, 2011-05-29 at 03:23 +, Greg Sabino Mullane wrote:
 My own bare bones wish list for such a tracker is:
 
 * Runs on Postgres
 * Has an email interface

I will add

* Free/open source software

to that.

Here is a list to choose from:
http://en.wikipedia.org/wiki/Comparison_of_issue_tracking_systems

FLOSS with PostgreSQL backend:

OTRS
Request Tracker
LibreSource
MantisBT
Redmine
Flyspray
Roundup
Bugzilla
Trac

The next step would be to investigate the email interface capabilities
of these, and then also research how difficult they are to install and
maintain, and by that time we should be down to about three that we can
try out.





Re: [HACKERS] pg_upgrade automatic testing

2011-05-29 Thread Noah Misch
On Wed, May 25, 2011 at 10:07:45PM +0300, Peter Eisentraut wrote:
 On ons, 2011-04-27 at 18:14 -0400, Noah Misch wrote:
  Enthusiastic +1 for this concept.  There's at least one rough edge: it 
  fails if
  you have another postmaster running on port 5432.
 
 This has now been addressed: pg_upgrade accepts PGPORT settings.
 Attached is a slightly updated patch that runs the test suite with a port of
 65432, which you can override by setting PGPORT yourself.
 
 Anyway, is this something that people want in the repository?  It's not
 as polished as the pg_regress business, but it is definitely helpful.

I'd like it.  We've had bugs sit for months that would have been found
immediately by a buildfarm member running this test.  Having it in the
repository at least opens up that possibility.



Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Tom Lane
Cédric Villemain cedric.villemain.deb...@gmail.com writes:
 2011/5/29 Tom Lane t...@sss.pgh.pa.us:
 OK, do you like the attached version of that logic?  (Other fragments
 of the patch as before.)

 The idea was that removing even one page from the VACUUM will prevent
 the relfrozenxid update and the reltuples (and relpages) update.
 Now, I believe that once we've skipped at least one page thanks to
 SKIP_PAGES_THRESHOLD, we should be more aggressive and remove as many
 pages as possible from the VACUUM, thanks to the VM.

That would require proof, not just suggestion.  Skipping pages will
defeat the OS read-ahead algorithm, and so could easily cost more than
reading them.

regards, tom lane



Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-05-29 Thread Josh Kupershmidt
On Sun, May 29, 2011 at 5:04 AM, Noah Misch n...@leadboat.com wrote:
 What risks arise from unconditionally allowing these calls for the same user's
 backends?  `pg_cancel_backend' ought to be safe enough; the user always has
 access to the standard cancellation protocol, making the SQL interface a mere
 convenience (albeit a compelling one).  `pg_terminate_backend' does open up
 access to a new behavior, but no concrete risks come to mind.

Looking around, I see there were real problems[1] with sending SIGTERM
to individual backends back in 2005 or so, and pg_terminate_backend()
was only deemed safe enough to put in for 8.4 [2]. So expanding
pg_terminate_backend() privileges does make me a tad nervous.

Reading through those old threads made me realize this patch would
give database owners the ability to kill off autovacuum workers. Seems
like we'd want to restrict that power to superusers.

 On the other hand, this *would* be substantial new authority for database
 owners.  Seems like a reasonable authority to grant, though.

And I also realized that this patch's approach might force us to
maintain a permissions wart if we ever want to implement fine-grained
control for this stuff, e.g. a per-role setting enabling self-kills.
It would be a bit lame to have to document "Use this CREATE/ALTER ROLE
flag. Or be the database owner. Or be a superuser."

 Now, a few technical comments about the patch:
 1.) This bit looks dangerous:
 +                backend = pgstat_fetch_stat_beentry(i);
 +                if (backend->st_procpid == pid) {

 Since pgstat_fetch_stat_beentry() might return NULL.

 I think you want BackendPidGetProc().

Ah, thanks for the pointer.

Josh

--
[1] 
http://postgresql.1045698.n5.nabble.com/pg-terminate-backend-idea-td1930120.html
[2] 
http://postgresql.1045698.n5.nabble.com/Re-COMMITTERS-pgsql-Add-pg-terminate-backend-to-allow-terminating-only-a-single-td1983763i20.html



Re: [HACKERS] Getting a bug tracker for the Postgres project

2011-05-29 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 That doesn't mean that better integration cannot be worked on later, but
 this illusion that a bug tracker must have magical total awareness of
 the entire flow of information in the project from day one is an
 illusion and has blocked this business for too long IMO.

If it has only a partial view of the set of bugs being worked on, it's
not going to meet the goals that are being claimed for it.

I don't doubt that somebody could run around and link every discussion
about a bug into the tracker.  I'm just dubious that that actually
*will* happen with enough reliability to make the tracker more useful
than a mailing-list search.

In the end, I think that requests for a tracker mostly come from people
who are not part of this community, or at least not part of its mailing
lists (which is about the same thing IMO).  If they submitted a bug
report via the lists, they're generally going to get replies via email,
and that seems sufficient to me.  But if they submitted a report via the
web form, they might well be expecting that they can track what's going
on with it on a web page.  And that's not unreasonable.  But we could
fix that without any changes at all in our work processes.  Just have
the webform add a cc: bugbot-bugn...@postgresql.org to each submitted
email, and set up a bot to collect the traffic and display it on a
suitable web page.  (Spam filtering left as an exercise for the reader.)

regards, tom lane



Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Pavan Deolasee
On Sun, May 29, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Cédric Villemain cedric.villemain.deb...@gmail.com writes:
 2011/5/29 Tom Lane t...@sss.pgh.pa.us:
 OK, do you like the attached version of that logic?  (Other fragments
 of the patch as before.)

  The idea was that removing even one page from the VACUUM will prevent
  the relfrozenxid update and the reltuples (and relpages) update.
  Now, I believe that once we've skipped at least one page thanks to
  SKIP_PAGES_THRESHOLD, we should be more aggressive and remove as many
  pages as possible from the VACUUM, thanks to the VM.

 That would require proof, not just suggestion.  Skipping pages will
 defeat the OS read-ahead algorithm, and so could easily cost more than
 reading them.


My worry is that what we have right now is also based on just assumptions
and gut feelings rather than any numbers.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com



Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Tom Lane
Pavan Deolasee pavan.deola...@gmail.com writes:
 On Sun, May 29, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 That would require proof, not just suggestion.  Skipping pages will
 defeat the OS read-ahead algorithm, and so could easily cost more than
 reading them.

 My worry is that what we have right now is also based on just assumptions
 and gut feelings rather than any numbers.

So go collect some numbers.

regards, tom lane



Re: [HACKERS] Vacuum, visibility maps and SKIP_PAGES_THRESHOLD

2011-05-29 Thread Pavan Deolasee
On Fri, May 27, 2011 at 8:40 PM, Greg Stark gsst...@mit.edu wrote:

 Separately it's a bit strange that we actually have to visit the
 pages. We have all the information we need in the VM to determine
 whether there's a run of 32 vacuum-clean pages. Why can't we look at
 the next 32 pages and if they're all vacuum-clean then skip looking at
 the heap at all for them. What we do now is do the regular vacuum
 algorithm and only after we've processed 32 pages in a row realize
 that it was a waste of effort.


Maybe we want to have visibility map APIs to test whether a chunk of
pages is all visible or not. We can then use that API to test every N
blocks (where N is the number at which a continuous sequential scan
would still be better than a sequential scan with gaps) and either read
all of them sequentially or just skip all of them.
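The chunk-wise idea above can be sketched with a bool array standing in for the visibility map. `vm_range_all_visible()` is a hypothetical name for the proposed API; the point is only the read-or-skip-whole-chunks decision.

```c
#include <assert.h>
#include <stdbool.h>

/* Is every page in [start, start+n) all-visible (clamped to nblocks)? */
static bool
vm_range_all_visible(const bool *vm, int nblocks, int start, int n)
{
    for (int i = start; i < start + n && i < nblocks; i++)
        if (!vm[i])
            return false;
    return true;
}

/* Pages read if vacuum skips only whole all-visible chunks of n blocks. */
static int
pages_read_chunked(const bool *vm, int nblocks, int n)
{
    int nread = 0;

    for (int start = 0; start < nblocks; start += n)
    {
        int len = (start + n <= nblocks) ? n : nblocks - start;

        if (vm_range_all_visible(vm, nblocks, start, n))
            continue;           /* skip the whole chunk, no heap reads */
        nread += len;           /* otherwise read the chunk sequentially */
    }
    return nread;
}
```

Unlike the current code, this decides before touching the heap: a chunk is either read in one sequential run or not read at all, so readahead is never half-defeated.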

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com



Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Pavan Deolasee
On Sun, May 29, 2011 at 9:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavan Deolasee pavan.deola...@gmail.com writes:
 On Sun, May 29, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 That would require proof, not just suggestion.  Skipping pages will
 defeat the OS read-ahead algorithm, and so could easily cost more than
 reading them.

 My worry is that what we have right now is also based on just assumptions
 and gut feelings rather than any numbers.

 So go collect some numbers.


I am sorry if I sounded terse above. But my gripe is that sometimes we
are too reluctant to listen to ideas, and insist on producing some hard
numbers first, which might take significant effort. But we are not
equally strict when such changes are introduced initially. For
example, in this particular case, the change was introduced after this
discussion:

http://archives.postgresql.org/pgsql-hackers/2008-12/msg01316.php

Heikki suggested 20, and Simon proposed 32 to make it a power of 2. But
why not 16? 20 is closer to 16 than to 32. And Greg yesterday said 8 is
a better number based on his testing.

Maybe a performance build farm, as is being discussed, is the solution:
somewhere we can throw simple patches and see whether something helps or
not.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com



Re: [HACKERS] Getting a bug tracker for the Postgres project

2011-05-29 Thread Joe Abbate
Hi Tom,

On 05/29/2011 11:05 AM, Tom Lane wrote:
 In the end, I think that requests for a tracker mostly come from people
 who are not part of this community, or at least not part of its mailing
 lists (which is about the same thing IMO).

I think that's a bit harsh. I assume you consider GSM a part of the
community, and he's asking for a tracker, even going to the trouble of
posting a "Help Wanted!" article about it.

 If they submitted a bug
 report via the lists, they're generally going to get replies via email,
 and that seems sufficient to me.  But if they submitted a report via the
 web form, they might well be expecting that they can track what's going
 on with it on a web page.  And that's not unreasonable.  But we could
 fix that without any changes at all in our work processes.  Just have
 the webform add a cc: bugbot-bugn...@postgresql.org to each submitted
 email, and set up a bot to collect the traffic and display it on a
 suitable web page.  (Spam filtering left as an exercise for the reader.)

I think there's more to a tracker than having bug submitters find all
the emails related to it. For example, one can use it to aggregate
interesting data, like how many bugs were reported per person/email
address, or per PostgreSQL version or OS (or maybe I'm not aware and
something like this is already going on behind the submission form).

Anyway, I may be willing to do some work on a tracker -- if there's
interest -- since at least part of the work could fit in with the
database interface area of the Pyrseas project.

To collect info/discuss, I could use
http://wiki.postgresql.org/wiki/TrackerDiscussion but I see there's a
request to not modify/add anything without talking to Stefan
Kaltenbrunner. Would a new page be preferable?

All the best,

Joe



Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Tom Lane
Pavan Deolasee pavan.deola...@gmail.com writes:
 I am sorry if I sounded terse above. But my gripe is that sometimes we
 are too reluctant to listen to ideas, and insist on producing some hard
 numbers first, which might take significant effort. But we are not
 equally strict when such changes are introduced initially.

The reason for not wanting to change it without some actual evidence
is that there is already evidence: the code has been in the field with
this setting since 8.4, and nobody's vacuum performance has fallen off a
cliff.  So while I'd agree that there was little testing done before the
code went in, there is more than zero reason to leave it where it is.
Without some positive evidence showing that another value is better,
I'm disinclined to change it.  I also think that you're not helping
by complaining about the code without being willing to do some work
to try to collect such evidence.

 Heikki suggested 20, and Simon proposed 32 to make it a power of 2. But
 why not 16? 20 is closer to 16 than to 32. And Greg yesterday said 8 is
 a better number based on his testing.

Greg said he had found that the read speed was the same for reading
every page vs reading every 8th page.  That's not the same as concluding
that 8 is the optimal skip distance for vacuum; or at least, he didn't
say that's what he had concluded.  vacuum isn't just reading ...

regards, tom lane



Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Pavan Deolasee
On Sun, May 29, 2011 at 10:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavan Deolasee pavan.deola...@gmail.com writes:
 I am sorry if I sounded terse above. But my gripe is that sometimes we
 are too reluctant to listen to ideas, and insist on producing some hard
 numbers first, which might take significant effort. But we are not
 equally strict when such changes are introduced initially.

 The reason for not wanting to change it without some actual evidence
 is that there is already evidence: the code has been in the field with
 this setting since 8.4, and nobody's vacuum performance has fallen off a
 cliff.

Well, that's probably because there was definitely much improvement
over what existed before. But that does not mean we can't make it
better. IOW there are no complaints because there is no regression.

 So while I'd agree that there was little testing done before the
 code went in, there is more than zero reason to leave it where it is.
 Without some positive evidence showing that another value is better,
 I'm disinclined to change it.  I also think that you're not helping
 by complaining about the code without being willing to do some work
 to try to collect such evidence.


I am not complaining about the code. I am suggesting we can be more
receptive to ideas, especially when we know what we have today was not
backed by any evidence either. I will anyway do some tests and post
numbers when I work on the single-pass vacuum patch. I'll try to
experiment with this stuff at that time.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com



Re: [HACKERS] Getting a bug tracker for the Postgres project

2011-05-29 Thread Stefan Kaltenbrunner
On 05/29/2011 05:47 PM, Joe Abbate wrote:
 Hi Tom,
 
 On 05/29/2011 11:05 AM, Tom Lane wrote:
 In the end, I think that requests for a tracker mostly come from people
 who are not part of this community, or at least not part of its mailing
 lists (which is about the same thing IMO).
 
 I think that's a bit harsh. I assume you consider GSM a part of the
 community and he's asking for a tracker, even going to the trouble of
 posting a Help Wanted! article about it.
 
 If they submitted a bug
 report via the lists, they're generally going to get replies via email,
 and that seems sufficient to me.  But if they submitted a report via the
 web form, they might well be expecting that they can track what's going
 on with it on a web page.  And that's not unreasonable.  But we could
 fix that without any changes at all in our work processes.  Just have
 the webform add a cc: bugbot-bugn...@postgresql.org to each submitted
 email, and set up a bot to collect the traffic and display it on a
 suitable web page.  (Spam filtering left as an exercise for the reader.)

[...]

 To collect info/discuss, I could use
 http://wiki.postgresql.org/wiki/TrackerDiscussion but I see there's a
 request to not modify/add anything without talking to Stefan
 Kaltenbrunner. Would a new page be preferable?

feel free to reuse/edit the page as you like (I have just removed the
notice) - the "don't edit" thingy was added because people started to
find the page via Google (while searching for a tracker/bug-reporting
tool) and considered it official status information, or a way to
sell^pitch their preferred tool to me personally.



Stefan



[HACKERS] pgbench--new transaction type

2011-05-29 Thread Jeff Janes
If you use pgbench -S -M prepared at a scale where all data fits in
memory, most of what you are benchmarking is network/IPC chatter and
table locking, which is fine if that is what you want to do.  This
patch adds a new transaction type of -P, which does the same thing as
-S but moves the main loop of selects, 10,000 at a time, into
pl/pgSQL.  This does a good job of exercising the executor rather than
IPC.
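The quoted message doesn't include the server-side function itself, only the call in the patch hunk further down. A minimal sketch of what a pgbench_query(maxaid, loops) installed at initialization time could look like, assuming the body (the name and argument shape come from the hunk; the body is a guess):

```sql
-- Hypothetical sketch: run the -S lookup "loops" times entirely
-- server-side, so the per-select client/server round trips disappear.
CREATE FUNCTION pgbench_query(maxaid integer, loops integer)
RETURNS integer AS $$
DECLARE
    bal integer := 0;
BEGIN
    FOR i IN 1 .. loops LOOP
        SELECT abalance INTO bal
          FROM pgbench_accounts
         WHERE aid = 1 + floor(random() * maxaid)::integer;
    END LOOP;
    RETURN bal;
END;
$$ LANGUAGE plpgsql;
```

With something like this, one client/server exchange drives 10,000 primary-key lookups, which is why the sps numbers below jump so sharply.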

This can simulate workloads that have primary key look ups as the
inner side of large nested loop.  It is also useful for isolating and
profiling parts of the backend code.

I did not implement this as a new query mode (-M plpgsql), because the
lack of transaction control in pl/pgSQL means it can only be used for
select-only transactions rather than as a general method.  So I
thought a new transaction type made more sense.

I didn't implement it as a custom file using -f because:
1) It seems to be a natural extension of the existing built-ins.  Also
-f is fiddly.  Several times I've wanted to ask posters who are
discussing the other built in transactions to run something like this
and report back, which is easier to do if it is also builtin.
2) It needs initialization code which -f does not support.
3) I don't see how I can make it automatically detect and respond to
:scale if it were run under -f.

Perhaps issues 2 and 3 would be best addressed by extending the
general -f facility, but that would be a lot more work, and I don't
know how well received it would be.

The reporting might be an issue.  I don't want to call it TPS when it
is not really transactions being reported, so for now I've kept the TPS
line reporting true transactions, and added a separate reporting line
for selects per second.

I know I also need to add to the web-docs, but I'm hoping to wait on
that until I get some feedback on whether the whole approach is
considered to be viable or not.

some numbers for single client runs on 64-bit AMD Opteron Linux:

12,567 sps  under -S
19,646 sps  under -S -M prepared
58,165 sps  under -P

Cheers,

Jeff
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index bb18c89..9a43fd5 100644
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** int			fillfactor = 100;
*** 130,135 
--- 130,136 
   * -s instead */
  #define ntellers	10
  #define naccounts	100000
+ #define plpgsql_loops	10000
  
  bool		use_log;			/* log transaction latencies to a file */
  bool		is_connect;			/* establish connection for each transaction */
*** static char *select_only = {
*** 273,278 
--- 274,284 
  	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
  };
  
+ /* -P case */
+ static char *select_only_plpgsql = {
+ 	"SELECT pgbench_query( " CppAsString2(naccounts) " *:scale, " CppAsString2(plpgsql_loops) " );\n"
+ };
+ 
  /* Function prototypes */
  static void setalarm(int seconds);
  static void *threadRun(void *arg);
*** printResults(int ttype, int normal_xacts
*** 1686,1691 
--- 1692,1699 
  		s = "Update only pgbench_accounts";
  	else if (ttype == 1)
  		s = "SELECT only";
+ 	else if (ttype == 4)
+ 		s = "SELECT only via plpgsql";
  	else
  		s = "Custom query";
  
*** printResults(int ttype, int normal_xacts
*** 1708,1713 
--- 1716,1726 
  	}
  	printf("tps = %f (including connections establishing)\n", tps_include);
  	printf("tps = %f (excluding connections establishing)\n", tps_exclude);
+ 	if (ttype == 4)
+ 	{
+ 		printf("selects per second = %f (including connections establishing)\n", tps_include * plpgsql_loops);
+ 		printf("selects per second = %f (excluding connections establishing)\n", tps_exclude * plpgsql_loops);
+ 	}
  
  	/* Report per-command latencies */
  	if (is_latencies)
*** main(int argc, char **argv)
*** 1766,1772 
  	int			is_no_vacuum = 0;		/* no vacuum at all before testing? */
  	int			do_vacuum_accounts = 0; /* do vacuum accounts before testing? */
  	int			ttype = 0;		/* transaction type. 0: TPC-B, 1: SELECT only,
!  * 2: skip update of branches and tellers */
  	char	   *filename = NULL;
  	bool		scale_given = false;
  
--- 1779,1785 
  	int			is_no_vacuum = 0;		/* no vacuum at all before testing? */
  	int			do_vacuum_accounts = 0; /* do vacuum accounts before testing? */
  	int			ttype = 0;		/* transaction type. 0: TPC-B, 1: SELECT only,
!  * 2: skip update of branches and tellers 3: custom sql, 4: SELECT only via plpgsql*/
  	char	   *filename = NULL;
  	bool		scale_given = false;
  
*** main(int argc, char **argv)
*** 1823,1829 
  	state = (CState *) xmalloc(sizeof(CState));
  	memset(state, 0, sizeof(CState));
  
! 	while ((c = getopt(argc, argv, "ih:nvp:dSNc:j:Crs:t:T:U:lf:D:F:M:")) != -1)
  	{
  		switch (c)
  		{
--- 1836,1842 
  	state = (CState *) xmalloc(sizeof(CState));
  	memset(state, 0, sizeof(CState));
  
! 	while ((c = getopt(argc, argv, "ih:nvp:dSNPc:j:Crs:t:T:U:lf:D:F:M:")) != -1)
  	{
  		switch (c)
  		{
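
For context, the -P query string above calls a server-side function
pgbench_query() that the patch would install at initialization time.
The function name and argument order come from the SELECT string in the
patch; the body and parameter names below are assumptions, not Jeff's
actual code -- just a sketch of what such a pl/pgSQL loop might look like:

```sql
-- Hypothetical sketch of the pl/pgSQL helper invoked by -P.
-- Runs `loops` random primary-key lookups server-side, the same
-- work that -S issues one statement at a time from the client.
CREATE OR REPLACE FUNCTION pgbench_query(total_accounts integer,
                                         loops integer)
RETURNS integer AS $$
DECLARE
    result integer := 0;
    i      integer;
BEGIN
    FOR i IN 1..loops LOOP
        SELECT abalance INTO result
          FROM pgbench_accounts
         WHERE aid = 1 + floor(random() * total_accounts)::integer;
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql;
```

With something like this installed, each pgbench "transaction" executes
one SELECT from the client but 10,000 index lookups in the executor,
which is why the sps numbers diverge so sharply from the -S case.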

Re: [HACKERS] Getting a bug tracker for the Postgres project

2011-05-29 Thread Joe Abbate
On 05/29/2011 02:01 PM, Stefan Kaltenbrunner wrote:
 feel free to reuse/edit the page as you like it(I have just removed the
 notice) - the don't edit thingy was added because people started to
 find the page via google (while searching for a tracker/bugreporting
 tool) and considered it official status information or a way to
 sell^pitch their preferred tool to me personally.

Thanks Stefan.

I've summarized the main points made in the recent discussion and done
some minor additional research on the lists suggested by Peter and Chris
Browne.  Anyone interested in the tracker, please visit
http://wiki.postgresql.org/wiki/TrackerDiscussion and add your
feedback/input.

All the best,

Joe




Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Cédric Villemain
2011/5/29 Tom Lane t...@sss.pgh.pa.us:
 Cédric Villemain cedric.villemain.deb...@gmail.com writes:
 2011/5/29 Tom Lane t...@sss.pgh.pa.us:
 OK, do you like the attached version of that logic?  (Other fragments
 of the patch as before.)

 The idea was that removing even one page from the VACUUM scan will
 prevent the relfrozenxid update and the reltuples (and relpages) update.
 Now, I believe that once we've skipped at least one page thanks to
 SKIP_PAGES_THRESHOLD, we should be more aggressive and remove as
 many pages as possible from the VACUUM, thanks to the VM.

 That would require proof, not just suggestion.  Skipping pages will
 defeat the OS read-ahead algorithm, and so could easily cost more than
 reading them.

Correct, it needs proof. Parenthesis: I also learned that reading
the first block of a file makes read-ahead use its largest window from
the beginning (the one that POSIX_FADV_SEQUENTIAL sets too), so
removing those initial reads might be counter-productive as well. But this
is very hard to benchmark, because read-ahead is also influenced by
memory pressure, for example.

In theory, 1. the read-ahead algorithm is a bit smarter than that and can
cope with reads-with-holes (if the holes are not larger than the read-ahead
window), and 2. if the holes are that large, then maybe it is not so good to
keep a larger read-ahead window (which keeps trashing our buffer
cache).




-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] pgbench--new transaction type

2011-05-29 Thread Greg Smith

On 05/29/2011 03:11 PM, Jeff Janes wrote:

If you use pgbench -S -M prepared at a scale where all data fits in
memory, most of what you are benchmarking is network/IPC chatter, and
table locking.


If you profile it, you'll find a large amount of the time is actually 
spent doing more mundane things, like memory allocation.  The network 
and locking issues are really not the bottleneck at all in a surprising 
number of these cases.  Your patch isn't really dependent on your being 
right about the cause here, so this doesn't impact your 
submission any.  Just wanted to clarify that what people expect to be 
slowing things down in this situation and what actually shows up when 
you profile are usually quite different.


I'm not sure whether this feature makes sense to add to pgbench, but 
it's interesting to have it around for developer testing.  The way 
you've built this isn't messing with the code too much to accomplish 
that, and your comments about it being hard to do this using -f are 
all correct.  Using a custom test file makes it easy to shoot yourself 
in the foot unless you keep a firm grip on what you're doing.



some numbers for single client runs on 64-bit AMD Opteron Linux:
12,567 sps  under -S
19,646 sps  under -S -M prepared
58,165 sps  under -P
   


10,000 is too big of a burst to run at once.  The specific thing I'm 
concerned about is what happens if you try this mode when using -T to 
enforce a runtime limit, and your SELECT rate isn't high.  If you're 
only doing 100 SELECTs/second because your scale is big enough to be 
seek bound, you could overrun by nearly two minutes.


I think this is just a matter of turning the optimization around a bit.  
Rather than starting with a large batch size and presuming that's ideal, 
in this case a different approach is really needed.  You want the 
smallest batch size that gives you a large win here.  My guess is that 
most of the gain here comes from increasing the batch size to something in 
the 10 to 100 range; jumping to 10K is probably overkill.  Could you try 
some smaller numbers and see where the big gains start falling off?


Obligatory code formatting nitpick:  try not to overrun the right margin 
any further than the existing code around line 1779, where you add more 
ttype comments.  That needs to turn into a multi-line comment.  Rest of 
the patch looks fine, and don't worry about resubmitting for that; just 
something to tweak on your next update.  A slightly more descriptive 
filename for the patch would help out those of us who look at a lot of 
pgbench patches, too.  Something like pgbench_loop_v1.patch for 
example would make it easier for me to remember which patch this was by 
its name.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] Getting a bug tracker for the Postgres project

2011-05-29 Thread Greg Stark
On Sun, May 29, 2011 at 3:36 PM, Joe Abbate j...@freedomcircle.com wrote:
  Anyone interested in the tracker, please visit
 http://wiki.postgresql.org/wiki/TrackerDiscussion and add your
 feedback/input.

I think this illustrates exactly what we *don't* want to happen with a
bug tracker. We want the discussion to stay *here*, not on some other
medium accessible only through the web and editable only through a web
interface.

Also, your summary seems to have missed the point of the "has email
interface" requirement. The table of features you listed has just
"Creation of bugs via mail interface" as the only feature that is
accessible from email.

I'm not sure what Robert meant but I suspect he meant what I would
want which is the ability to add comments, close bugs, set other
properties, etc. By email. My biggest gripe about bugzilla was that it
sent you an email with updates to the bug but you couldn't respond to
that email.

My ideal bug tracker is the Debian one, which basically stays out of
your way and lets you cc any message to a specific bug at
n...@bugs.debian.org, which archives that message in the bug and sends
it to anyone listening to the bug. And you can use control commands
to close it or edit it -- basically turning all our existing "that's
not a bug, bleah bleah" messages into "close nnn; that's not a bug,
bleah bleah" messages.
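
For reference, a control message in the Debian BTS workflow described
above looks roughly like this (the bug number and command choices are
illustrative only; the BTS documentation has the exact command set, and
closing is normally done by mailing nnn-done@bugs.debian.org):

```
To: control@bugs.debian.org
Cc: 12345@bugs.debian.org

severity 12345 wishlist
tags 12345 + wontfix
thanks

That's not a bug; the behaviour is intentional because ...
```

The command block at the top is executed by the control server, the
`thanks` line stops command processing, and the free-form text after it
is archived in the bug log and sent to its subscribers -- which is what
keeps the discussion in email rather than behind a web form.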

-- 
greg
