Re: [HACKERS] Constraint Exclusion + Joins?

2006-04-27 Thread Tom Lane
"Brandon Black" <[EMAIL PROTECTED]> writes:
> I was wondering (for planning purposes) if anyone knew the status of
> constraint exclusions moving up to query runtime and working for
> joins.

The latter, done; the former, not on the radar screen IMHO.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Jonah H. Harris
On 4/27/06, Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> Is the source easier to maintain?

Yes, aside from extra lookahead, that was my main motivation.


--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Alvaro Herrera
Jonah H. Harris wrote:
> On 4/27/06, Christopher Kings-Lynne <[EMAIL PROTECTED]> wrote:
> > Is it faster?  How much faster?
> 
> I'm not sure, I haven't done direct timings on it vs. the bison
> version.  When I wrote it, I wasn't really concerned with the time it
> took to parse.

Is the source easier to maintain?

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Constraint Exclusion + Joins?

2006-04-27 Thread Brandon Black
I was wondering (for planning purposes) if anyone knew the status of
constraint exclusions moving up to query runtime and working for
joins.  Is this something that's coming down the pipe in the
foreseeable future, or just on a back-burner to-do list, or probably
never happening, or... ?

I have a painful work-around for my particular case that's good enough
for now, but it would be helpful to know whether there's a good
probability I can convert my code to do things the easy way somewhere
in the foreseeable future if/when this feature goes in, or whether I
should consider design changes now before my problems grow.

Thanks,
Brandon



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Christopher Kings-Lynne

> I suggest that maybe the cleanest solution is to not use log level at
> all for this, but to invent a separate "autovacuum_verbosity" setting
> that controls how many messages autovac tries to log, using the above
> scale.  Anything it does try to log can just come out at LOG message
> setting.


+1




Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Christopher Kings-Lynne

>> Those messages were at LOG level because otherwise it's difficult to be
>> sure from the log that autovac is running at all.
>
> OK, so what do we want to do?  Clearly outputting something every time
> pg_autovacuum touches a database isn't ideal.  By default, the server
> logs should show significant events, which this is not.
>
> Do we want something output only the first time autovacuum runs?


I've considered several times proposing that I want to be able to turn 
off or do something about autovacuum log messages.  I just always 
thought it would be rejected.


I have it set up so that I get the last few hundred lines of my postgres 
logs mailed to me each day.  However, most of the time I just get a few 
hundred autovacuum messages.  So, I had to muck around with grepping out 
the autovacuum lines, etc.


I personally don't see the point of there being so many of those 
autovacuum log messages...


Chris




Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Jonah H. Harris
On 4/27/06, Christopher Kings-Lynne <[EMAIL PROTECTED]> wrote:
> Is it faster?  How much faster?

I'm not sure, I haven't done direct timings on it vs. the bison
version.  When I wrote it, I wasn't really concerned with the time it
took to parse.

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324



Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Christopher Kings-Lynne

> For the sake of saying again, I already have a recursive-descent
> parser for PostgreSQL written in a PCCTS grammar.  It's something I
> started writing years ago, but I'd be willing to consider open
> sourcing it if the PostgreSQL community will really entertain the
> thought of switching.
>
> Unfortunately, this discussion usually ends up with, "why would we
> want to change what we have now when it already works?"


Is it faster?  How much faster?





Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Rocco Altier
> I suspect a patch to convert PostgreSQL to C++ wouldn't be
> welcomed. Haha...
> 
Checking my calendar, I think you are about 26 days too late to make
that suggestion...

-rocco



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Larry Rosenman
Simon Riggs wrote:
> On Thu, 2006-04-27 at 14:53 -0400, Tom Lane wrote:
>> "Larry Rosenman" <[EMAIL PROTECTED]> writes:
>>> I'd like to see a more concrete definition of what we
>>> want Autovacuum to output and at what levels.
>> 
>> autovacuum_verbosity
> 
> Should we call it autovacuum_messages?
> 
> In current usage...
> 
> _verbosity controls how much information each message gives
> _messages controls what types of messages are logged

That probably works, but I'm not sure which of the two should add VERBOSE
to the VACUUM commands that autovacuum.c emits.



-- 
Larry Rosenman
Database Support Engineer

PERVASIVE SOFTWARE. INC.
12365B RIATA TRACE PKWY
3015
AUSTIN TX  78727-6531

Tel: 512.231.6173
Fax: 512.231.6597
Email: [EMAIL PROTECTED]
Web: www.pervasive.com



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Simon Riggs
On Thu, 2006-04-27 at 14:53 -0400, Tom Lane wrote:
> "Larry Rosenman" <[EMAIL PROTECTED]> writes:
> > I'd like to see a more concrete definition of what we 
> > want Autovacuum to output and at what levels. 
> 
> autovacuum_verbosity

Should we call it autovacuum_messages?

In current usage...

_verbosity controls how much information each message gives
_messages controls what types of messages are logged

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com/




Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread mark
On Thu, Apr 27, 2006 at 02:42:35PM -0500, Taral wrote:
> On 27 Apr 2006 15:25:45 -0400, Greg Stark <[EMAIL PROTECTED]> wrote:
> > It would be pretty cool to have a type-safe codebase. It just seems like
> > an awful lot of work for a mostly aesthetic improvement.

> Does anyone have some benchmarks I can run? I can run tests to see if
> this aliasing makes a noticeable difference or not...

You would have to fix the code first. :-)

I completely back down when it comes down to the suspected
intrusiveness of the change, causing too much upset. It's the idea
that the aliasing rules are not worth following, that caused me to
enter. Until a good patch is available, that is less intrusive than
what people are suspecting, nothing is going to change.

I suspect a patch to convert PostgreSQL to C++ wouldn't be
welcomed. Haha...

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Larry Rosenman
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
>> Also it'd be nice to have it a (4th?) level which would show the
>> results of the equations being applied.
> 
> That I think would fall more naturally into the category of debug
> support --- I'm happy if we just emit that at DEBUG1 and let people
> select it with log_min_messages.
> 
>   regards, tom lane

I was going to make that same comment, as this seems to be more
implementation detail, which should be at DEBUGn.

LER

-- 
Larry Rosenman  
Database Support Engineer

PERVASIVE SOFTWARE. INC.
12365B RIATA TRACE PKWY
3015
AUSTIN TX  78727-6531 

Tel: 512.231.6173
Fax: 512.231.6597
Email: [EMAIL PROTECTED]
Web: www.pervasive.com 



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Also it'd be nice to have it a (4th?) level which would show the results
> of the equations being applied.

That I think would fall more naturally into the category of debug
support --- I'm happy if we just emit that at DEBUG1 and let people
select it with log_min_messages.

regards, tom lane



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Alvaro Herrera
Larry Rosenman wrote:
> Tom Lane wrote:
> > Chris Browne <[EMAIL PROTECTED]> writes:
> >> At "level 2," it seems to me that it would be quite useful to have
> >> some way of getting at the verbose output of VACUUM.
> > 
> > I think you can do that now, if you set min_log_level to INFO. 
> > However, it might be cleaner if we allowed a "level 3" that made all
> > of autovac's vacuums be VERBOSE.
> 
> I was thinking along those exact lines.  (A 3rd level).

Also it'd be nice to have it a (4th?) level which would show the results
of the equations being applied.

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Larry Rosenman
Tom Lane wrote:
> Chris Browne <[EMAIL PROTECTED]> writes:
>> At "level 2," it seems to me that it would be quite useful to have
>> some way of getting at the verbose output of VACUUM.
> 
> I think you can do that now, if you set min_log_level to INFO. 
> However, it might be cleaner if we allowed a "level 3" that made all
> of autovac's vacuums be VERBOSE.
> 

I was thinking along those exact lines.  (A 3rd level).

LER

-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 512-248-2683 E-Mail: ler@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Tom Lane
Chris Browne <[EMAIL PROTECTED]> writes:
> At "level 2," it seems to me that it would be quite useful to have
> some way of getting at the verbose output of VACUUM.

I think you can do that now, if you set min_log_level to INFO.  However,
it might be cleaner if we allowed a "level 3" that made all of autovac's
vacuums be VERBOSE.

regards, tom lane



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Larry Rosenman
Bruce Momjian wrote:
> Uh, while you are at it, the background writer and checkpoint
> operations need similar treatment.  :-)
> 
Sure.

I'm willing to look at and work it out, if no one else is currently
working on it.

LER

>> This sounds like a winner to me.  Anyone else want to grab it?  I'm
>> in the position to try and do this, but don't want to step on anyone
>> else's toes. 
>> 
>> LER
-- 
Larry Rosenman  
Database Support Engineer

PERVASIVE SOFTWARE. INC.
12365B RIATA TRACE PKWY
3015
AUSTIN TX  78727-6531 

Tel: 512.231.6173
Fax: 512.231.6597
Email: [EMAIL PROTECTED]
Web: www.pervasive.com 



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Bruce Momjian

Uh, while you are at it, the background writer and checkpoint operations
need similar treatment.  :-)

---

Larry Rosenman wrote:
> Tom Lane wrote:
> > "Larry Rosenman" <[EMAIL PROTECTED]> writes:
> >> I'd like to see a more concrete definition of what we
> >> want Autovacuum to output and at what levels.
> > 
> > I would argue that what people typically want is
> > 
> > (0) nothing
> > 
> > (1) per-database log messages
> > 
> > or
> > 
> > (2) per-table log messages (including per-database)
> > 
> > The first problem is that (2) is only available at DEBUG2 or below,
> > which is not good because that also clutters the log with a whole lot
> > of implementer-level debugging info.
> > 
> > The second problem is that we don't really want to use the global
> > log_min_messages setting to determine this, because that constrains
> > your decision about how much chatter you want from ordinary backends.
> > 
> > I suggest that maybe the cleanest solution is to not use log level at
> > all for this, but to invent a separate "autovacuum_verbosity" setting
> > that controls how many messages autovac tries to log, using the above
> > scale.  Anything it does try to log can just come out at LOG message
> > setting.
> 
> This sounds like a winner to me.  Anyone else want to grab it?  I'm 
> in the position to try and do this, but don't want to step on anyone
> else's toes.
> 
> LER
> 
> > 
> > regards, tom lane
> 
> 
> 
> -- 
> Larry Rosenman
> Database Support Engineer
> 
> PERVASIVE SOFTWARE. INC.
> 12365B RIATA TRACE PKWY
> 3015
> AUSTIN TX  78727-6531
> 
> Tel: 512.231.6173
> Fax: 512.231.6597
> Email: [EMAIL PROTECTED]
> Web: www.pervasive.com
> 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Chris Browne
[EMAIL PROTECTED] (Tom Lane) writes:

> "Larry Rosenman" <[EMAIL PROTECTED]> writes:
>> I'd like to see a more concrete definition of what we 
>> want Autovacuum to output and at what levels. 
>
> I would argue that what people typically want is
>
>   (0) nothing
>
>   (1) per-database log messages
>
> or
>
>   (2) per-table log messages (including per-database)
>
> The first problem is that (2) is only available at DEBUG2 or below,
> which is not good because that also clutters the log with a whole
> lot of implementer-level debugging info.
>
> The second problem is that we don't really want to use the global
> log_min_messages setting to determine this, because that constrains
> your decision about how much chatter you want from ordinary
> backends.
>
> I suggest that maybe the cleanest solution is to not use log level
> at all for this, but to invent a separate "autovacuum_verbosity"
> setting that controls how many messages autovac tries to log, using
> the above scale.  Anything it does try to log can just come out at
> LOG message setting.

At "level 2," it seems to me that it would be quite useful to have
some way of getting at the verbose output of VACUUM.

Consider when I vacuum a table, thus:

/* [EMAIL PROTECTED]/dba2 performance=*/ vacuum verbose analyze days;
INFO:  vacuuming "public.days"
INFO:  "days": found 0 removable, 1893 nonremovable row versions in 9 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.03 sec.
INFO:  analyzing "public.days"
INFO:  "days": 9 pages, 1893 rows sampled, 1893 estimated total rows
VACUUM

The only thing that PostgreSQL will log generally about this is, if
the query runs for a while, that I requested "vacuum verbose analyze
days;", and that this took 4284ms to run.

It would be really nice if we could have some way of logging the
details, namely of numbers of row versions removed/nonremovable, and
of pages affected.

If we could regularly log that sort of information, that could be very
useful in figuring out some "more nearly optimal" schedule for
vacuuming.

One of our people wrote a Perl script that will take verbose VACUUM
output and essentially parses it so as to be able to generate a bunch
of SQL queries to try to collect how much time was spent, and what
sorts of changes got accomplished.

At present, getting anything out of that mandates that every VACUUM
request have stdout tied to this Perl script, which I'm not overly
keen on, for any number of reasons, notably:

- Any vacuums run separately aren't monitored at all

- Parsing not-forcibly-stable-across-versions file formats with Perl
  is a fragile thing

- Ideally, this would be nice to get into the PG "engine," somewhere,
  whether as part of standard logging, or as part of how pg_autovacuum
  works...
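As a rough illustration of the kind of extraction described above, here is a
minimal C sketch that pulls the row-version and page counts out of one
"found ... removable" INFO line of VACUUM VERBOSE output. The struct and
function names (VacuumStats, parse_vacuum_info_line) are hypothetical, and the
pattern assumes the exact line format shown in the sample output earlier in
this message; as noted, that format is not guaranteed to be stable across
versions.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the figures reported on one INFO line. */
typedef struct VacuumStats
{
    char table[64];     /* table name as printed between the quotes */
    long removable;     /* removable row versions */
    long nonremovable;  /* nonremovable row versions */
    long pages;         /* pages scanned */
} VacuumStats;

/*
 * Parse a line of the form
 *   INFO:  "days": found 0 removable, 1893 nonremovable row versions in 9 pages
 * Returns true only when all four values were read.  Any other line
 * (for example: INFO:  vacuuming "public.days") yields false.
 */
static bool
parse_vacuum_info_line(const char *line, VacuumStats *out)
{
    int matched = sscanf(line,
        "INFO: \"%63[^\"]\": found %ld removable, %ld nonremovable row versions in %ld pages",
        out->table, &out->removable, &out->nonremovable, &out->pages);

    return matched == 4;
}
```

A caller would feed each line of the captured output to
parse_vacuum_info_line() and record the filled-in struct together with a
timestamp whenever it returns true.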

Having some ability to collect statistics about "we recovered 42 pages
from table foo at 12:45" would seem useful both from an immediate
temporal perspective where it could suggest whether specific tables
were being vacuumed too (seldom|often), and from a more
global/analytic perspective of perhaps suggesting better kinds of
vacuuming policies.  (In much the same way that I'd like to have some
way of moving towards an analytically better value for
default_statistics_target than 10...)

If people are interested, I could provide a copy of the "analyze
VACUUM stats" script...
-- 
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/sgml.html
"I would rather spend 10 hours reading someone else's source code than
10  minutes listening  to Musak  waiting for  technical  support which
isn't." -- Dr. Greg Wettstein, Roger Maris Cancer Center



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> Doesn't achieve the same thing, unless you mandate that every part of
>> the system use the identical massively-overloaded union struct to refer
>> to every node.  

> Are you saying it's important to preserve the ability for modules to add new
> node types without notifying the rest of the code? I thought all the node
> types were defined in a single header file currently anyways.

No, they're scattered through at least half a dozen headers.  We don't
really have "dynamic" extensibility because the NodeTag enum is declared
in just one place, but we do have separation of concerns: stuff dealing
with parsenodes.h need not import all the executor node types, for
instance.  The big objection to a single-union-struct approach is that
it forecloses any locality of knowledge about node types.

(I also wonder how many compilers would give up the ghost entirely on
being fed a union type of 250+ separate structs, with in aggregate
several thousand fields ... and even if they didn't fail, how fast
they'd compile code making heavy use of such a beast.)

> It would be pretty cool to have a type-safe codebase. It just seems like
> an awful lot of work for a mostly aesthetic improvement.

Yeah, I think that's how we all feel.  I'd not even waste any time
thinking about it, except that I'm afraid the compiler guys may
eventually stop supporting the traditional semantics ...

regards, tom lane



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Taral
On 27 Apr 2006 15:25:45 -0400, Greg Stark <[EMAIL PROTECTED]> wrote:
> It would be pretty cool to have a type-safe codebase. It just seems like
> an awful lot of work for a mostly aesthetic improvement.

Does anyone have some benchmarks I can run? I can run tests to see if
this aliasing makes a noticeable difference or not...

--
Taral <[EMAIL PROTECTED]>
"You can't prove anything."
-- Gödel's Incompetence Theorem



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> Greg Stark <[EMAIL PROTECTED]> writes:
>
> > There are other ways of achieving the same thing. Structs containing a union
> > for the subclass fields for example.
> 
> Doesn't achieve the same thing, unless you mandate that every part of
> the system use the identical massively-overloaded union struct to refer
> to every node.  

Are you saying it's important to preserve the ability for modules to add new
node types without notifying the rest of the code? I thought all the node
types were defined in a single header file currently anyways.

> That would (a) defeat the purpose of extensibility, and (b) make the code
> more error prone not less so (since it'd be notationally easy to refer to a
> field that's not actually present in the given node subtype).

You could use a local pointer to preserve the existing model of a single
point where the decision is made. That could be encapsulated in a macro that
included an assertion to verify the type tag.

It would be pretty cool to have a type-safe codebase. It just seems like
an awful lot of work for a mostly aesthetic improvement.

-- 
greg




Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Larry Rosenman
Tom Lane wrote:
> "Larry Rosenman" <[EMAIL PROTECTED]> writes:
>> I'd like to see a more concrete definition of what we
>> want Autovacuum to output and at what levels.
> 
> I would argue that what people typically want is
> 
>   (0) nothing
> 
>   (1) per-database log messages
> 
> or
> 
>   (2) per-table log messages (including per-database)
> 
> The first problem is that (2) is only available at DEBUG2 or below,
> which is not good because that also clutters the log with a whole lot
> of implementer-level debugging info.
> 
> The second problem is that we don't really want to use the global
> log_min_messages setting to determine this, because that constrains
> your decision about how much chatter you want from ordinary backends.
> 
> I suggest that maybe the cleanest solution is to not use log level at
> all for this, but to invent a separate "autovacuum_verbosity" setting
> that controls how many messages autovac tries to log, using the above
> scale.  Anything it does try to log can just come out at LOG message
> setting.

This sounds like a winner to me.  Anyone else want to grab it?  I'm 
in the position to try and do this, but don't want to step on anyone
else's toes.

LER

> 
>   regards, tom lane



-- 
Larry Rosenman
Database Support Engineer

PERVASIVE SOFTWARE. INC.
12365B RIATA TRACE PKWY
3015
AUSTIN TX  78727-6531

Tel: 512.231.6173
Fax: 512.231.6597
Email: [EMAIL PROTECTED]
Web: www.pervasive.com



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Jonah H. Harris
On 4/27/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> I suggest that maybe the cleanest solution is to not use log level at
> all for this, but to invent a separate "autovacuum_verbosity" setting
> that controls how many messages autovac tries to log, using the above
> scale.  Anything it does try to log can just come out at LOG message
> setting.

/me agrees this is, by all accounts, the best and cleanest option.

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Tom Lane
"Larry Rosenman" <[EMAIL PROTECTED]> writes:
> I'd like to see a more concrete definition of what we 
> want Autovacuum to output and at what levels. 

I would argue that what people typically want is

(0) nothing

(1) per-database log messages

or

(2) per-table log messages (including per-database)

The first problem is that (2) is only available at DEBUG2 or below,
which is not good because that also clutters the log with a whole lot of
implementer-level debugging info.

The second problem is that we don't really want to use the global
log_min_messages setting to determine this, because that constrains
your decision about how much chatter you want from ordinary backends.

I suggest that maybe the cleanest solution is to not use log level at
all for this, but to invent a separate "autovacuum_verbosity" setting
that controls how many messages autovac tries to log, using the above
scale.  Anything it does try to log can just come out at LOG message
setting.

regards, tom lane
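
For readers following the thread, the proposed scale can be sketched in C as
below. This is only an illustration of the idea, not code from the server:
the enum, the autovacuum_verbosity variable, and both functions are
hypothetical stand-ins, and the sketch writes to a FILE rather than going
through the server's logging machinery.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical scale mirroring levels (0) to (2) in the message above. */
typedef enum AutovacVerbosity
{
    AUTOVAC_QUIET = 0,          /* (0) nothing */
    AUTOVAC_PER_DATABASE = 1,   /* (1) per-database log messages */
    AUTOVAC_PER_TABLE = 2       /* (2) per-table messages, plus per-database */
} AutovacVerbosity;

/* Stand-in for the proposed setting; not an existing configuration variable. */
static AutovacVerbosity autovacuum_verbosity = AUTOVAC_PER_DATABASE;

/* A message is wanted when its level is nonzero and within the setting. */
static bool
autovac_should_log(AutovacVerbosity message_level)
{
    return message_level != AUTOVAC_QUIET &&
           message_level <= autovacuum_verbosity;
}

/*
 * Emit the message if the setting allows it.  Everything that passes the
 * check comes out at the same severity, independent of the global log level.
 * Returns true when the message was written.
 */
static bool
autovac_log(FILE *out, AutovacVerbosity message_level, const char *text)
{
    if (!autovac_should_log(message_level))
        return false;
    fprintf(out, "LOG:  autovacuum: %s\n", text);
    return true;
}
```

The point of the design is that the decision depends only on the dedicated
setting, so ordinary backends can keep whatever log threshold the
administrator chose for them.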



Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 3.

2006-04-27 Thread Teodor Sigaev



> The ideal thing would be for GIN to return a count of the number of
> distinct heap tuples referenced by the entries in the index, but I
> suppose that would be impractical for VACUUM to compute.


Of course, in this case we should collect heap pointers in memory..

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Larry Rosenman
Bruce Momjian wrote:
> Matthew T. O'Connor wrote:
>> I think there are two things people typically want to know from the
>> logs: 1) Is autovacuum running 2) Did autovacuum take action (issue
>> a VACUUM or ANALYZE) 
>> 
>> I don't think we need mention the name of each and every database we
>> touch, we can, but it should be at a lower level like DEBUG1 or
>> something. 
> 
> OK, that part is done.
> 
>> I don't know what logging level these things should go at, but I for
>> one would like them to be fairly high and easy to get to, perhaps NOTICE?
> 
> Interesting idea.  I had forgotten that for server messages, LOG is at
> the top, and ERROR, NOTICE, etc are below it.  We could make them
> NOTICE, but then all user NOTICE messages appear in the logs too.
> Yuck. 
> 
> Do we want to LOG every time autovacuum does something?  Is that going
> to fill up the logs worse than the per-database line?

My general take is I (as an admin), want to know that:
a) autovacuum is doing its periodic checks
b) when it actually vacuums a (database|table) we know what time it did
   it. 


> 
> The real issue is that we give users zero control over what autovacuum
> logs, leading to the TODO item.  I guess the question is until the
> TODO item is done, what do we want to do?
> 
> How do people like the idea of having this in postgresql.conf:
> 
>   autovacuum_set = 'set log_min_messages = ''error'''
> 
> and set autovacuum to output notice/info/error messages as desired by
> the administrator?  This shouldn't be too hard to do, and it is very
> flexible.

We definitely need to do "something" wrt autovacuum messages,
but this doesn't say what gets logged at what level for autovacuum.

I'd like to see a more concrete definition of what we 
want Autovacuum to output and at what levels. 

LER

-- 
Larry Rosenman  
Database Support Engineer

PERVASIVE SOFTWARE. INC.
12365B RIATA TRACE PKWY
3015
AUSTIN TX  78727-6531 

Tel: 512.231.6173
Fax: 512.231.6597
Email: [EMAIL PROTECTED]
Web: www.pervasive.com 



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Bruce Momjian
Matthew T. O'Connor wrote:
> I think there are two things people typically want to know from the logs:
> 1) Is autovacuum running
> 2) Did autovacuum take action (issue a VACUUM or ANALYZE)
> 
> I don't think we need mention the name of each and every database we 
> touch, we can, but it should be at a lower level like DEBUG1 or something.

OK, that part is done.

> I don't know what logging level these things should go at, but I for one 
> would like them to be fairly high and easy to get to, perhaps NOTICE?

Interesting idea.  I had forgotten that for server messages, LOG is at
the top, and ERROR, NOTICE, etc are below it.  We could make them
NOTICE, but then all user NOTICE messages appear in the logs too. Yuck.
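
To make the ordering concrete, here is a small C sketch. The enum and the
function name are hypothetical and are not the server's own identifiers; the
order of the enum follows the documented severity order for
log_min_messages, in which LOG sorts above ERROR for server-log purposes. It
shows why lowering the threshold far enough to see NOTICE also admits every
user NOTICE.

```c
#include <stdbool.h>

/*
 * Hypothetical severity ranking for the server log, lowest first.
 * Debug levels are collapsed into one entry for brevity.
 */
typedef enum Severity
{
    SEV_DEBUG,
    SEV_INFO,
    SEV_NOTICE,
    SEV_WARNING,
    SEV_ERROR,
    SEV_LOG,
    SEV_FATAL,
    SEV_PANIC
} Severity;

/* A message reaches the server log when it ranks at or above the threshold. */
static bool
server_log_emits(Severity message, Severity threshold)
{
    return message >= threshold;
}
```

With the threshold at SEV_ERROR, a SEV_LOG message is emitted and a
SEV_NOTICE message is not; dropping the threshold to SEV_NOTICE lets both
through, along with any NOTICE raised by user sessions.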

Do we want to LOG every time autovacuum does something?  Is that going to
fill up the logs worse than the per-database line?

The real issue is that we give users zero control over what autovacuum
logs, leading to the TODO item.  I guess the question is until the TODO
item is done, what do we want to do?

How do people like the idea of having this in postgresql.conf:

autovacuum_set = 'set log_min_messages = ''error'''

and set autovacuum to output notice/info/error messages as desired by
the administrator?  This shouldn't be too hard to do, and it is very
flexible.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Logging pg_autovacuum

2006-04-27 Thread Matthew T. O'Connor

I think there are two things people typically want to know from the logs:
1) Is autovacuum running
2) Did autovacuum take action (issue a VACUUM or ANALYZE)

I don't think we need mention the name of each and every database we 
touch, we can, but it should be at a lower level like DEBUG1 or something.


I don't know what logging level these things should go at, but I for one 
would like them to be fairly high and easy to get to, perhaps NOTICE?



Matt


Bruce Momjian wrote:
> Tom Lane wrote:
>> [EMAIL PROTECTED] (Bruce Momjian) writes:
>>> Change log message about vacuuming database name from LOG to DEBUG1.
>>> Prevents duplicate meaningless log messages.
>>
>> Could we have some discussion about this sort of thing, rather than
>> unilateral actions?
>>
>> Those messages were at LOG level because otherwise it's difficult to be
>> sure from the log that autovac is running at all.
>
> OK, so what do we want to do?  Clearly outputting something every time
> pg_autovacuum touches a database isn't ideal.  By default, the server
> logs should show significant events, which this is not.
>
> Do we want something output only the first time autovacuum runs?





Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Martijn van Oosterhout
On Thu, Apr 27, 2006 at 12:21:55PM -0500, Taral wrote:
> > If we do subclassing like this:
> >
> > struct Node { ... };
> > struct Value { struct Node; ... };
> > etc.
> >
> > do we still run into the alias problem?
> 
> Nope, it appears to get rid of the alias problem completely. But it
> requires anonymous structure support (C99?) to work without changing
> anything other than headers.

On that compiler maybe, but what about others? What other problems does
strict-alias cause? Anyway, near as I can tell, anonymous structs are
not in the C standard and I don't think GCC supports them either...
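For comparison, a sketch of the same subclassing idea with a named base member instead of an anonymous one, which plain C accepts. The names (Node, Value, value_as_node, node_as_value) and the tag values are illustrative only, not PostgreSQL's definitions. It relies on the standard guarantee that a pointer to a struct, suitably converted, points to its first member.

```c
#include <assert.h>

/* Hypothetical tags, chosen only for this sketch. */
typedef enum NodeTag { T_Node = 1, T_Value } NodeTag;

typedef struct Node
{
    NodeTag type;
} Node;

typedef struct Value
{
    Node node;      /* named base member; it must come first */
    long ival;
} Value;

/* Upcast without any pointer cast: take the address of the base member. */
Node *
value_as_node(Value *v)
{
    return &v->node;
}

/* Downcast: only valid when the Node really is the first member of a Value. */
Value *
node_as_value(Node *n)
{
    assert(n->type == T_Value);
    return (Value *) n;
}
```

The cost is the one raised in the quoted text: every v->type in existing code would have to become v->node.type, so the change is not confined to headers.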

> As a bonus, if we ever change Node, we don't have to update any other
> structures...

Node is unlikely to ever change, it's been like this for at least ten
years...

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




[HACKERS] Logging pg_autovacuum

2006-04-27 Thread Bruce Momjian
Tom Lane wrote:
> [EMAIL PROTECTED] (Bruce Momjian) writes:
> > Change log message about vacuuming database name from LOG to DEBUG1.
> > Prevents duplicate meaningless log messages.
> 
> Could we have some discussion about this sort of thing, rather than
> unilateral actions?
> 
> Those messages were at LOG level because otherwise it's difficult to be
> sure from the log that autovac is running at all.

OK, so what do we want to do?  Clearly outputting something every time
pg_autovacuum touches a database isn't ideal.  By default, the server
logs should show significant events, which this is not.

Do we want something output only the first time autovacuum runs?

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Taral
On 4/27/06, Taral <[EMAIL PROTECTED]> wrote:
> If we do subclassing like this:
>
> struct Node { ... };
> struct Value { struct Node; ... };
> etc.
>
> do we still run into the alias problem?

Nope, it appears to get rid of the alias problem completely. But it
requires anonymous structure support (C99?) to work without changing
anything other than headers.

As a bonus, if we ever change Node, we don't have to update any other
structures...

--
Taral <[EMAIL PROTECTED]>
"You can't prove anything."
-- Gödel's Incompetence Theorem



Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 3.

2006-04-27 Thread Teodor Sigaev

> How about adding a column to pg_am indicating "these indexes must always
> keep same tuple count as heap".  This would be true for all current AMs,
> false for GIN.


Yes, it's the simplest solution, but it doesn't check index consistency.

Possibly, we can count the number of itempointers to heap tuples during build/insert,
and during bulkdelete count the number of deleted and remaining itempointers. So,


N[before bulkdelete] == N[after bulkdelete] + N[deleted]
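A minimal sketch of that bookkeeping, assuming the item pointers can be modelled as a flat array of "dead" flags. The function names bulk_delete and counts_consistent are hypothetical and are not part of the GIN patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Walk nitems item pointers, counting those marked dead as deleted.
 * Returns the number of item pointers left in the index.
 */
size_t
bulk_delete(const bool *dead, size_t nitems, size_t *ndeleted)
{
    size_t kept = 0;

    assert(dead != NULL || nitems == 0);
    *ndeleted = 0;
    for (size_t i = 0; i < nitems; i++)
    {
        if (dead[i])
            (*ndeleted)++;
        else
            kept++;
    }
    return kept;
}

/* The proposed check: N[before] == N[after] + N[deleted]. */
bool
counts_consistent(size_t n_before, size_t n_after, size_t n_deleted)
{
    return n_before == n_after + n_deleted;
}
```

Here n_before would be the counter maintained during build/insert, so a mismatch points at a lost or duplicated item pointer rather than at a difference from the heap tuple count.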




--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Taral
On 4/27/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> Greg Stark <[EMAIL PROTECTED]> writes:
> > There are other ways of achieving the same thing. Structs containing a union
> > for the subclass fields for example.
>
> Doesn't achieve the same thing, unless you mandate that every part of
> the system use the identical massively-overloaded union struct to refer
> to every node.

If we do subclassing like this:

struct Node { ... };
struct Value { struct Node; ... };
etc.

do we still run into the alias problem?

--
Taral <[EMAIL PROTECTED]>
"You can't prove anything."
-- Gödel's Incompetence Theorem



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Taral
On 4/27/06, Zeugswetter Andreas DCP SD <[EMAIL PROTECTED]> wrote:
> Can you please explain what exactly was not working ?
> xlc has in the past shown warnings that were actually problematic code
> that gcc did not show (and the cc variant of xlc also does not show).

This has nothing to do with warnings. With xlc version 6, this code:

Value *
makeString(char *str)
{
    Value  *v = makeNode(Value);

    v->type = T_String;
    v->val.str = str;
    return v;
}

will return objects whose "type" field is T_Value (650), because the
compiler reorders the assignment that makeNode makes with the one that
makeString itself makes.
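To make the report concrete, here is a simplified stand-in for the pattern, assuming a makeNode-like helper that writes the tag through a Node pointer before the caller writes it again through a Value pointer. new_node, make_string and the numeric value of T_String are hypothetical; only T_Value = 650 comes from the message. Built as an ordinary function call this returns T_String. The misbehaviour described presumably needs the helper to be expanded inline, so that the optimizer sees both stores and decides they cannot refer to the same memory.

```c
#include <assert.h>
#include <stdlib.h>

/* T_Value = 650 is from the report; T_String's value is an assumption. */
typedef enum NodeTag { T_Invalid = 0, T_Value = 650, T_String } NodeTag;

typedef struct Node
{
    NodeTag type;
} Node;

typedef struct Value
{
    NodeTag type;               /* same position as Node.type */
    union
    {
        long  ival;
        char *str;
    } val;
} Value;

/* Stand-in for makeNode(): zeroed allocation, tag stored via a Node pointer. */
static void *
new_node(size_t size, NodeTag tag)
{
    Node *n = calloc(1, size);

    assert(n != NULL);
    n->type = tag;              /* first store, through Node * */
    return n;
}

Value *
make_string(char *str)
{
    Value *v = new_node(sizeof(Value), T_Value);

    v->type = T_String;         /* second store, through Value *, same bytes */
    v->val.str = str;
    return v;
}
```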

--
Taral <[EMAIL PROTECTED]>
"You can't prove anything."
-- Gödel's Incompetence Theorem



Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 3.

2006-04-27 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Teodor Sigaev wrote:
>> We appreciate any comments, help and suggestions. For the third item we have no
>> idea how to fix it except to exclude GIN indexes from the check.

> How about adding a column to pg_am indicating "these indexes must always
> keep same tuple count as heap".  This would be true for all current AMs,
> false for GIN.

There's a definitional issue here, which is what does it mean to be
counting index tuples.  I think GIN could bypass the VACUUM error check
by always returning the heap tuple count as its index tuple count.  This
would result in the index's reltuples field getting set to that value
rather than the number of index entries, but arguably that's what we
want anyway.  From what I recollect of the planner's use of index
reltuples, values greater than heap tuple count would not behave sanely:
it considers index.reltuples to be an upper bound on the number of rows
an indexscan could fetch.

The ideal thing would be for GIN to return a count of the number of
distinct heap tuples referenced by the entries in the index, but I
suppose that would be impractical for VACUUM to compute.

regards, tom lane



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes:
> Martijn van Oosterhout  writes:
>> You're right, PostgreSQL uses a form of subclassing, so that a (for
>> example) struct ArrayRefExprState is occasionally referred to using a
>> (struct ExprState*) or even a (struct Node*). In C, the logical way to
>> get that to work is by casting, do you have a better way?

> There are other ways of achieving the same thing. Structs containing a union
> for the subclass fields for example.

Doesn't achieve the same thing, unless you mandate that every part of
the system use the identical massively-overloaded union struct to refer
to every node.  That would (a) defeat the purpose of extensibility, and
(b) make the code more error prone not less so (since it'd be
notationally easy to refer to a field that's not actually present in the
given node subtype).
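A sketch of the union approach under discussion, to show where the objection bites. AnyNode, make_integer and integer_value are hypothetical names, not anything in the PostgreSQL tree.

```c
#include <assert.h>

typedef enum NodeTag { T_Integer, T_String } NodeTag;

/* One struct for every node kind: the tag plus a union of all payloads. */
typedef struct AnyNode
{
    NodeTag type;
    union
    {
        struct { long ival; } integer;
        struct { const char *str; } string;
    } u;
} AnyNode;

AnyNode
make_integer(long i)
{
    AnyNode n;

    n.type = T_Integer;
    n.u.integer.ival = i;
    return n;
}

/* The tag check has to be done by hand; the type system does not help. */
long
integer_value(const AnyNode *n)
{
    assert(n->type == T_Integer);
    return n->u.integer.ival;
}
```

Nothing stops a caller from writing n->u.string.str on an integer node, and adding a new node kind means editing the one shared union, which is how both points (a) and (b) show up in practice.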

regards, tom lane



Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 3.

2006-04-27 Thread Alvaro Herrera
Teodor Sigaev wrote:

>   * GIN stores several ItemPointer to heap tuple, so VACUUM FULL produces
> this warning message:
>  WARNING:  index "idx" contains 88395 row versions, but table contains
> 51812 row versions
>  HINT:  Rebuild the index with REINDEX.
> 
> We appreciate any comments, help and suggestions. For the third item we have no
> idea how to fix it except to exclude GIN indexes from the check.

How about adding a column to pg_am indicating "these indexes must always
keep same tuple count as heap".  This would be true for all current AMs,
false for GIN.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Greg Stark
Martijn van Oosterhout  writes:

> You're right, PostgreSQL uses a form of subclassing, so that a (for
> example) struct ArrayRefExprState is occasionally referred to using a
> (struct ExprState*) or even a (struct Node*). In C, the logical way to
> get that to work is by casting, do you have a better way?

There are other ways of achieving the same thing. Structs containing a union
for the subclass fields for example.

-- 
greg




Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 3.

2006-04-27 Thread Teodor Sigaev


> I took the liberty of revising your README.txt as a native speaker :)  I
> cleaned up the grammar a lot, etc.


Thank you very much. I added your README to patch.

New version of GIN is available:

http://www.sigaev.ru/gin/gin.gz
http://www.sigaev.ru/gin/README.txt

Changes from Try 2:
* add regression tests for &&,@,~ operators
* add regression tests for GIN over int4[] and text[]
* fix regression opr_sanity test
* update README ( by Christopher )

Open Items:
  * Teach optimizer/executor that GIN is intrinsically clustered, i.e., it
    always returns ItemPointer in ascending order.
  * Tweak gincostestimate.
  * GIN stores several ItemPointer to heap tuple, so VACUUM FULL produces
    this warning message:
      WARNING:  index "idx" contains 88395 row versions, but table contains
      51812 row versions
      HINT:  Rebuild the index with REINDEX.

We appreciate any comments, help and suggestions. For the third item we have no idea
how to fix it except to exclude GIN indexes from the check.


Sorry for our persistence, but we really need to know the community's choice
between committing it and making it a contrib module, because it will be difficult
to keep such a big patch up to date...


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Jonah H. Harris
On 4/27/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> ... and is far more maintainable than an RD parser, and is not a
> performance bottleneck.  I've never seen yyparse occupy as much as 2%
> of a backend profile ...

Not more maintainable by any stretch of the imagination.  For example,
try and remove the AS alias for columns in our bison grammar without
having to remove keywords from unreserved_keywords :)  In PCCTS (or a
hand-written RD parser), it's VERY easy because you can control the
lookahead as necessary.
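As a toy illustration of the lookahead point (not PCCTS output and not PostgreSQL's grammar), a hand-written routine can decide whether the token after a column is an alias simply by peeking at it. The Parser struct, the pre-split token array and parse_target are all assumptions made for this sketch, and only "," and FROM are treated as terminators.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Parser
{
    const char **tok;   /* tokens, already split */
    size_t       pos;   /* index of the next unread token */
    size_t       ntok;
} Parser;

/* Look at the next token without consuming it; "" at end of input. */
static const char *
peek(const Parser *p)
{
    return p->pos < p->ntok ? p->tok[p->pos] : "";
}

static int
is_word(const char *t, const char *w)
{
    return strcmp(t, w) == 0;
}

/*
 * target := column [ [AS] alias ]
 * Returns the alias, or the column name when no alias is present.
 */
const char *
parse_target(Parser *p)
{
    assert(p->pos < p->ntok);
    const char *name = p->tok[p->pos++];

    if (is_word(peek(p), "AS"))
    {
        p->pos++;                       /* consume AS */
        assert(p->pos < p->ntok);
        return p->tok[p->pos++];        /* the alias */
    }

    /* No AS: the next token is an alias unless it ends the target. */
    const char *next = peek(p);
    if (*next != '\0' && !is_word(next, ",") && !is_word(next, "FROM"))
        return p->tok[p->pos++];

    return name;
}
```

A real parser would have to consult the full keyword list at the peek step, which is exactly where the reserved versus unreserved keyword question comes back in.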

Now, in terms of performance, it's hard to beat a LALR parser, but LL
parsers are comparable if done correctly; especially if written by
hand.  Besides, parsing itself isn't what kills us, it's the lack of
caching unprepared statements.  Yes, this is another topic in and of
itself, but I know there was discussion about it between you and Neil;
did anything ever come of it?

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324



Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Jonah H. Harris
On 4/27/06, Stephen Frost <[EMAIL PROTECTED]> wrote:
> The answer to that can certainly be "performance" provided other factors
> (such as maintainability) don't change much.  If you could show that
> then I think such a switch would be very seriously considered.

IMHO, switching parser-types (and parser generators) is more about
maintainability than performance itself.  SQL is much nicer in
recursive descent where you don't have yacc/bison limitations such as
1 token of lookahead and non-EBNF grammars.  The sort-of odd thing is
that PCCTS (like its much younger brother ANTLR) is intended to
generate ASTs whereas yacc/bison requires you to build the parse
tree manually (as we do).

Don't get me wrong, this was taken into consideration with PCCTS, but
it's not as optimal or beautiful as it would be to have PCCTS itself
generate the parse tree.  Still, it's nicer to maintain than a
yacc/bison grammar.

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324



Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Tom Lane
"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> Unfortunately, this discussion usually ends up with, "why would we
> want to change what we have now when it already works?"

... and is far more maintainable than an RD parser, and is not a
performance bottleneck.  I've never seen yyparse occupy as much as 2%
of a backend profile ...

regards, tom lane



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Martijn van Oosterhout
On Thu, Apr 27, 2006 at 11:08:59AM -0400, [EMAIL PROTECTED] wrote:
> ... If
> PostgreSQL isn't alias safe - it means that there is casting
> happening, that is not generally accepted as valid, minimizing the
> optimization potential of the code, and certainly not guaranteed to be
> correct. Every cast in C has the potential to contain bugs today, or
> tomorrow when somebody changes the lvalue or rvalue. If the types are
> incompatible, according to the compiler, the odds of a bug happening today
> or tomorrow become greater.

You're right, PostgreSQL uses a form of subclassing, so that a (for
example) struct ArrayRefExprState is occasionally referred to using a
(struct ExprState*) or even a (struct Node*). In C, the logical way to
get that to work is by casting, do you have a better way?

The fact is, strict-aliasing only breaks things in the name of
performance.

> I think we can agree on the intrusiveness of the change needing to
> be linear or less to the value of the change. If the fix is simple,
> and it generates a 1% - 2% gain - you are buying both performance,
> and correctness. Personally, I value correctness. I think some would
> value performance though.

The fix is not simple. We wouldn't have disabled it if the fix was
easy. Basically, with strict-aliasing enabled, things break. The
compiler won't identify all the issues for you, how do you know you're
safe? So the system is actually *more* correct with strict-aliasing
disabled, because then it does exactly what the code says...

The two major other places we use casts are converting Datums to their
actual types, and our MemSet which replaces the braindead GCC one.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Zeugswetter Andreas DCP SD

> I ran afoul of these rules the other day when compiling pgsql 8.1 on
> AIX. The configure scripts are set up to look for "xlc" instead of
> "cc", and that command invokes cc with "-qalias=ansi", the ANSI-strict
> pointer aliasing mode.

Can you please explain what exactly was not working ?
xlc has in the past shown warnings that were actually problematic code
that gcc did not show (and the cc variant of xlc also does not show).

Andreas



Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Stephen Frost
* Jonah H. Harris ([EMAIL PROTECTED]) wrote:
> Unfortunately, this discussion usually ends up with, "why would we
> want to change what we have now when it already works?"

The answer to that can certainly be "performance" provided other factors
(such as maintainability) don't change much.  If you could show that
then I think such a switch would be very seriously considered.  Of
course, the performance improvement would have to be some substantial
amount (I would guess at least 5%, maybe 10%) to warrant the learning
curve associated with changing it.

If you're not interested or not able to do the performance comparison
then I guess you'd need to find someone who is and then work out if
they'd be willing to do the testing under some NDA in case it doesn't
pan out.

Thanks,

Stephen




Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread mark
On Thu, Apr 27, 2006 at 10:49:02AM -0400, Tom Lane wrote:
> [EMAIL PROTECTED] writes:
> > Even if it was only 1% - 2%. Isn't it worth it? :-)
> No.  According to the ancient saying, "I can make this program
> arbitrarily fast ... if it doesn't have to give the right answer".
> Percentage-point improvements are not worth the risk of introducing
> hard-to-find, compiler-and-hardware-dependent bugs.

Addressing the aliasing rule concerns raised by the compiler, at least
in theory, is improving the compatibility of the program, and reducing
the number of bugs that may be exposed under unusual circumstances. If
PostgreSQL isn't alias safe - it means that there is casting
happening, that is not generally accepted as valid, minimizing the
optimization potential of the code, and certainly not guaranteed to be
correct. Every cast in C has the potential to contain bugs today, or
tomorrow when somebody changes the lvalue or rvalue. If the types are
incompatible, according to the compiler, the odds of a bug happening today
or tomorrow become greater.

If you are worried about PostgreSQL making use of the optimization
potential - and suggesting that by PostgreSQL making this impossible
currently, that you are preventing a compiler-and-hardware-dependent
bug, I strongly disagree. If most code is compiled with the aliasing
rules enabled in the optimizer, and your code is compiled with the
aliasing rules disabled, then you are using the lesser used
optimization paths in the compiler, actually raising the likelihood of
such a compiler bug being exposed. Compilers do have bugs - and by
choosing to use the non-standard optimization models, the chance of
you finding it first increases. You don't really want to be first.
You want the regression testing of the compiler to be first. :-)

> Show me how to find/prevent those bugs, and I'm all for going with
> the stricter rules.  But you're so far off base with the above argument
> that I wonder whether we understand each other at all.

I think we can agree on the intrusiveness of the change needing to
be linear or less to the value of the change. If the fix is simple,
and it generates a 1% - 2% gain - you are buying both performance,
and correctness. Personally, I value correctness. I think some would
value performance though.

We're both playing opposite roles, to lead to a balanced discussion,
which is good. Not off base. Contrary. How could you expect to have
a balanced discussion otherwise? :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Tom Lane
[EMAIL PROTECTED] writes:
> Even if it was only 1% - 2%. Isn't it worth it? :-)

No.  According to the ancient saying, "I can make this program
arbitrarily fast ... if it doesn't have to give the right answer".
Percentage-point improvements are not worth the risk of introducing
hard-to-find, compiler-and-hardware-dependent bugs.

Show me how to find/prevent those bugs, and I'm all for going with
the stricter rules.  But you're so far off base with the above argument
that I wonder whether we understand each other at all.

regards, tom lane



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread mark
On Thu, Apr 27, 2006 at 12:52:42PM +0200, Martijn van Oosterhout wrote:
> > Next time we have this discussion I wish someone would actually document 
> > the performance differences. IIRC most of what I have seen makes it at 
> > best marginal.
> I can imagine there are cases where the performance difference is
> nontrivial. Take this (somewhat contrived) example:
> int *i;
> char *c;
> while( *i < BIG_NUMBER )
>   *i += *c;
> With strict aliasing, the compiler need only load *c once; without it,
> it needs to load it each time through the loop because it has to consider
> the possibility that 'i' and 'c' point to the same memory location.

> PostgreSQL doesn't actually have loops of this kind so it's not
> something we need worry about. And you can achieve all the benefits by
> ...

PostgreSQL might not - however PostgreSQL does use GLIBC, which does have
inlined or preprocessor defined code.

I haven't done the timings myself - but I think the optimization would
in theory apply to wider code than just the above. *c doesn't need to
be constant. It can be moving, as would be in the case of an strcpy()
or memcpy() implementation. As you pointed out, such things as
auto-vectorization become impossible if it cannot guarantee that the
data is different. The aliasing rules are one part, the 'restrict'
keyword is the more important part I would imagine. Not dissimilar to
the C compiler auto-assigning variables into registers, but allowing
for the designer to hint using the 'register' keyword.

In the modern day, I see fewer and fewer people using 'register', as
not only does the compiler tend to get it right, but the compiler
may actually do a better job. I could see the same thing being
true of auto-detect using aliasing rules vs 'restrict' keyword usage.
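For reference, a small sketch of what the C99 'restrict' hint looks like in a copy-style loop. add_arrays is a made-up function; the qualifier is a promise from the caller that dst and src do not overlap, which is what frees the compiler to keep values in registers or vectorize.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Adds src[i] into dst[i].  'restrict' tells the compiler that, inside
 * this function, the two arrays are only reached through these pointers,
 * so a store to dst[i] cannot change any src[j].
 */
void
add_arrays(int *restrict dst, const int *restrict src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Unlike the type-based aliasing rules, this works even when both pointers have the same type, but calling it with overlapping arrays is undefined behaviour and the compiler will not check the promise.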

Even if it was only 1% - 2%. Isn't it worth it? :-) Especially for a
practice, that under existing code, has implementation defined
semantics. It might not work in the future, or with a new optimizer
mode that comes out, or a new platform...

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Tom Lane
Martijn van Oosterhout  writes:
> That's right, except I read "object", not "primitive type". The
> question revolves a bit around what an object is. This discussion on
> the GCC lists [1] suggests that the syntax a->b is merely syntactic
> sugar for (*a).b and thus the "object" being accessed is (*a), the type
> of b is not relevant to the decision.

The part of the spec that I'm looking at says

   [#7] An object shall have its stored value accessed only by
   an lvalue expression that has one of the following types:61)

    -- a type compatible with the effective type of the
       object,

    -- a qualified version of a type compatible with the
       effective type of the object,

    -- a type that is the signed or unsigned type
       corresponding to the effective type of the object,

    -- a type that is the signed or unsigned type
       corresponding to a qualified version of the effective
       type of the object,

    -- an aggregate or union type that includes one of the
       aforementioned types among its members (including,
       recursively, a member of a subaggregate or contained
       union), or

    -- a character type.

   61)The intent of this list is to specify those circumstances
      in which an object may or may not be aliased.

Which wouldn't be especially interesting, except for that footnote
(which in fact is one of only two uses of "alias" in the document;
there isn't any other discussion about aliasing at all).

As I read this, the aliasing rules are driven by the type of the
lvalue being fetched or assigned.  Thus, when you fetch or assign a
whole struct, your reading would be correct, but not for a fetch
or assignment of a single struct field.
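Two accesses that the quoted list permits under either reading, sketched here for illustration: inspecting an object through a character type, and moving its bytes into an object of another type with memcpy rather than through a cast pointer. float_bits and first_byte are hypothetical helpers, and the sketch assumes float is a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Reinterpret a float's bytes as an integer.  The float is never read
 * through a uint32_t lvalue; its bytes are copied into a real uint32_t.
 */
uint32_t
float_bits(float f)
{
    uint32_t bits;

    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Character types may access any object, per the last item in the list. */
unsigned char
first_byte(const void *obj)
{
    return *(const unsigned char *) obj;
}
```

By contrast, *(uint32_t *) &f reads a float object through an lvalue whose type matches none of the listed cases, which is the kind of access the rule leaves undefined.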

regards, tom lane



Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Jonah H. Harris
On 4/27/06, Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> We talked about it when GCC announced their switch.  The conclusion was
> that our grammar is still too much a moving target, so it would be too
> difficult to maintain such a grammar.

For the sake of saying it again, I already have a recursive-descent
parser for PostgreSQL written in a PCCTS grammar.  It's something I
started writing years ago, but I'd be willing to consider open
sourcing it if the PostgreSQL community will really entertain the
thought of switching.

Unfortunately, this discussion usually ends up with, "why would we
want to change what we have now when it already works?"

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324



Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Alvaro Herrera
Jesper Pedersen wrote:

> I have been thinking about this for a while and now that Google Summer of Code
> is coming I thought I would share this idea.
> 
> The GCC people have traded their bison/flex parser with a hand written
> recursive-descent parser for a nice speed up.
> 
> So it would be interesting to see if PostgreSQL would benefit from the same
> 'switch'.

We talked about it when GCC announced their switch.  The conclusion was
that our grammar is still too much a moving target, so it would be too
difficult to maintain such a grammar.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Summer of Code idea

2006-04-27 Thread Martijn van Oosterhout
On Tue, Apr 25, 2006 at 10:30:26PM +0200, Jesper Pedersen wrote:
> Hi.
> 
> I have been thinking about this for a while and now that Google Summer of Code
> is coming I thought I would share this idea.
> 
> The GCC people have traded their bison/flex parser with a hand written
> recursive-descent parser for a nice speed up.

Nice? The figures I'm seeing are 2%. Is that even noticeable?

> So it would be interesting to see if PostgreSQL would benefit from the same
> 'switch'.
> 
> By the looks of it *) the job could be completed within the time frame and
> maybe pgbench could serve as the testing framework for the performance
> measurements.

I think it's worth a try, but you have to consider that unlike C, SQL
as a language keeps changing in ways we have no idea about yet.
Whatever the result is, it has to be more maintainable than what we
have now...

> I think it has an academic angle to it -- something fun for the student and
> maybe a speed up for PostgreSQL :)

Absolutely. It'd be a fun experiment, if one were so inclined.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




[HACKERS] Summer of Code idea

2006-04-27 Thread Jesper Pedersen
Hi.

I have been thinking about this for a while and now that Google Summer of Code
is coming I thought I would share this idea.

The GCC people have traded their bison/flex parser with a hand written
recursive-descent parser for a nice speed up.

So it would be interesting to see if PostgreSQL would benefit from the same
'switch'.

By the looks of it *) the job could be completed within the time frame and
maybe pgbench could serve as the testing framework for the performance
measurements.

I think it has an academic angle to it -- something fun for the student and
maybe a speed up for PostgreSQL :)

Just an idea...

Keep up the great work you are doing !

Best regards,
 Jesper

*) http://gcc.gnu.org/ml/gcc-patches/2004-10/msg01969.html




[HACKERS] test

2006-04-27 Thread Dave Cramer
I've sent 3 previous messages to the list, and none of them have
arrived, or been bounced.


Dave



Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Martijn van Oosterhout
On Thu, Apr 27, 2006 at 07:34:19PM +0930, Andrew Dunstan wrote:
> Next time we have this discussion I wish someone would actually document 
> the performance differences. IIRC most of what I have seen makes it at 
> best marginal.

I can imagine there are cases where the performance difference is
nontrivial. Take this (somewhat contrived) example:

int *i;
char *c;
while( *i < BIG_NUMBER )
    *i += *c;

With strict aliasing, the compiler need only load *c once; without it,
it needs to load it each time through the loop because it has to consider
the possibility that 'i' and 'c' point to the same memory location.

PostgreSQL doesn't actually have loops of this kind so it's not
something we need worry about. And you can achieve all the benefits by
explicitly loading *c into a local variable before the loop. I can
believe that certain RISC architectures would benefit more than
something like the Intel CISC architecture.
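A sketch of the manual hoisting described above, with made-up function names. One caveat, as far as I can tell: character types are allowed to alias any object, so with 'char *c' as written the compiler would have to reload *c even under strict aliasing. The sketch therefore uses 'short' for the second pointer, where the type-based rule does apply.

```c
#include <assert.h>

/*
 * Version 1: without type-based aliasing the compiler must assume that a
 * store to *total may change *step, and reload *step every iteration.
 */
int
accumulate_reload(int *total, const short *step, int limit)
{
    assert(*step > 0);          /* otherwise the loop never terminates */
    while (*total < limit)
        *total += *step;
    return *total;
}

/*
 * Version 2: the load is hoisted by hand, so the generated code no longer
 * depends on which aliasing rules the compiler applies.
 */
int
accumulate_hoisted(int *total, const short *step, int limit)
{
    const short s = *step;      /* read once, before the loop */

    assert(s > 0);
    while (*total < limit)
        *total += s;
    return *total;
}
```

Both versions compute the same result when the pointers do not overlap; the hoisted one just makes that assumption explicit in the source.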

> Personally, I think this whole mess results from a bad case of 
> committee-itis.

I think the goal was noble (at least, according to the best version
I've heard so far): make it easier to compete with Fortran in numerical
processing. If you define a struct vector { double x,y,z,d; } and the
compiler can assume that values in that structure can only be changed
via a (vector*) pointer, it can do things like load an array of them
into a large parallel processing unit.

Of course, this is useless for most of the programs written in C, and in
C99 they fixed it the right way using the "restrict" keyword, which is
actually far more useful for the above purpose than strict-aliasing.

Compiler writers love it because it makes it easier for them, but
actual benefits, hmm...

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




[HACKERS] AMD64 dual core mutex/spinlock problem

2006-04-27 Thread Zdenek Kotala

Some AMD64 dual core processor series have a problem with the atomic
operation "lock cmpxchg". It does not work correctly in some cases. (I don't
know if it is a bug or a feature.) This problem occurred in Solaris on the
AMD64 platform. Postgres implements its own lock mechanism with similar code,
"lock xchg". I am not sure if this is wrong or not. My question is whether
anyone has had a problem with spinlocks on AMD64 dual core in 64-bit mode.


  - Zdenek

PS: You can see the Solaris mutex_enter code at
http://cvs.opensolaris.org/source/xref/on/usr/src/uts/intel/ia32/ml/lock_prim.s
An extra lfence instruction after the atomic operation fixes the problem.




Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Andrew Dunstan

Taral wrote:
> > 4. Find the option for disabling strict alias and get configure to add
> > that.
>
> You'll still lose performance, but the option is "-qalias=noansi".


Next time we have this discussion I wish someone would actually document 
the performance differences. IIRC most of what I have seen makes it at 
best marginal.


Personally, I think this whole mess results from a bad case of 
committee-itis.


cheers

andrew


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Zeugswetter Andreas DCP SD

> > 4. Find the option for disabling strict alias and get configure to add
> > that.
> 
> You'll still lose performance, but the option is "-qalias=noansi".

My old xlc does not show that option; it is unfortunately version
specific. The currently compatible option to turn it off would be
-qnoansialias

So we can use:
xlc -qnoansialias

The default cc options are: -qlanglvl=extended -qnoro -qnoroconst
So I guess we could also use (but above is imho clearer/better): 
cc -qro -qroconst -qlanglvl=extc89

Andreas

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] ANSI-strict pointer aliasing rules

2006-04-27 Thread Martijn van Oosterhout
On Wed, Apr 26, 2006 at 08:13:21PM -0400, Tom Lane wrote:
> Actually, if xlc is behaving as Taral says then I'm pretty convinced
> it's *broken*; it is assuming far more than is allowed even by the ANSI
> aliasing rules.  As I read the spec, ANSI aliasing says that a given
> value must always be accessed through equivalent (up to signedness)
> primitive types, ie, if you store through an int pointer and fetch
> through a long pointer the compiler is allowed to reorder those two
> references. 

That's right, except I read "object", not "primitive type". The
question revolves a bit around what an object is. This discussion on
the GCC lists [1] suggests that the syntax a->b is merely syntactic
sugar for (*a).b, and thus the "object" being accessed is (*a); the type
of b is not relevant to the decision.
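
As a small illustration of the int/long case quoted above, the following
sketch shows what the rule lets a compiler assume, next to a way of
reinterpreting storage that does not depend on aliasing at all. The
function names are invented for the example.

```c
#include <string.h>

/*
 * int and long are not compatible types, so under the aliasing rules the
 * compiler may assume ip and lp never refer to the same storage. It is
 * then allowed to reorder the two stores, or to return the constant 1
 * without re-reading *ip. Calling this with pointers to the same storage
 * would be undefined behaviour.
 */
int store_both(int *ip, long *lp)
{
    *ip = 1;
    *lp = 2;
    return *ip;
}

/*
 * Copying bytes with memcpy is always permitted, so this reads the first
 * sizeof(int) bytes of a long without any aliasing question. The value
 * obtained still depends on the platform's byte order.
 */
int leading_int_of_long(const long *lp)
{
    int out;
    memcpy(&out, lp, sizeof out);
    return out;
}
```

With distinct objects, store_both(&i, &l) returns 1 and leaves 2 in l. The
disagreement in this thread is about whether the same freedom extends to
struct members reached through differently typed struct pointers.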

The standard [2] (at least that version) says:

3.14 object
region of data storage in the execution environment, the contents of
which can represent values

So it is not limited to primitive types.

> In the example Taral gives, both field references are to
> fields of type NodeTag.  I don't see anything in the spec that allows
> the compiler to assume they are distinct variables just because they are
> members of different struct types.  The spec restriction is defined in
> terms of the lvalue type of the particular store or fetch access, not on
> what kind of structure it's part of.

Well, I imagine it doesn't help that the result of malloc(), which
normally can't alias anything, is assigned to a global variable of a
particular type, and subsequently cast to its actual type.

However, the poster's original example doesn't exist in the current
codebase (we never assign T_String to a tag field, only to the type
field), so wherever the problem is, it's not here. At the end of the
day, our use of pointer casts makes the strict-aliasing rules a risk, so
we're hardly likely to enable it anytime soon.
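
For reference, the kind of pointer cast under discussion looks roughly like
the sketch below. The struct definitions are simplified stand-ins rather
than copies of the real headers, and the enum values and function names are
made up; only NodeTag, T_String and the "type" field name come from this
thread.

```c
/* Simplified, hypothetical stand-ins for the node structs. */
typedef enum NodeTag { T_Invalid = 0, T_Integer, T_String } NodeTag;

typedef struct Node  { NodeTag type; } Node;
typedef struct Value { NodeTag type; long ival; } Value;

/*
 * The cast pattern: treat any node pointer as a Node * and read its tag.
 * Whether the object accessed here is the NodeTag member or the whole
 * (*ptr) struct is exactly the point being argued above.
 */
NodeTag tag_via_node(const void *ptr)
{
    return ((const Node *) ptr)->type;
}

/*
 * A pointer to a struct, suitably converted, points to its first member,
 * so reading the tag through a NodeTag * uses an lvalue of the member's
 * own type.
 */
NodeTag tag_via_member(const void *ptr)
{
    return *(const NodeTag *) ptr;
}
```

Both functions return the same tag for the same node; they differ only in
which lvalue type the compiler sees for the access.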

Have a nice day,

[1] http://gcc.gnu.org/ml/gcc/2003-02/msg01438.html
[2] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf

Quotes from standard:

6.5 Expressions
7 An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:



6.3.2.1 Lvalues, arrays, and function designators

An lvalue is an expression with an object type or an incomplete type
other than void;
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.

