Re: [HACKERS] Hot Standby dev build (v8)

2009-01-18 Thread Heikki Linnakangas

Simon Riggs wrote:

On Fri, 2009-01-16 at 22:09 +0200, Heikki Linnakangas wrote:

RecentGlobalXmin is just a hint, it lags behind the real oldest xmin 
that GetOldestXmin() would return. If another backend has a more recent 
RecentGlobalXmin value, and has killed more recent tuples on the page, 
the latestRemovedXid written here is too old.

What do you think we should do instead?

Dunno. Maybe call GetOldestXmin().


We are discussing btree deletes, not btree vacuums. 


Pardon my ignorance, but what's the difference?


If we are doing
btree delete then we have an unreleased snapshot therefore we also have
a non-zero xmin. How can another backend have a later RecentGlobalXmin
or result from GetOldestXmin() than we do?


Sure it can, for example:

1. Transaction 1 begins in backend A
2. Transaction 2 begins in backend B, xmin = 1
3. Transaction 1 ends
4. Transaction 3 begins in backend C, xmin = 2
5. Backend C gets snapshot, TransactionXmin = 2, RecentGlobalXmin = 1
6. Transaction 2 ends.
7. Transaction 4 begins in backend A, gets snapshot TransactionXmin = 2, 
RecentGlobalXmin = 2

8. Transaction 4 kills tuple, using its RecentGlobalxmin of 1
9. Transaciont 3 splits the page, emits a delete xlog record, setting 
latestRemovedXid to its RecentGlobalXmin of 2


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] GIN improvements

2009-01-18 Thread Jeff Davis
On Fri, 2009-01-16 at 15:39 +0300, Teodor Sigaev wrote:
> > START_CRIT_SECTION();
> > ...
> > l = PageAddItem(...);
> > if (l == InvalidOffsetNumber)
> > elog(ERROR, "failed to add item to index page in \"%s\"",
> >  RelationGetRelationName(index));
> > 
> > It's no use using ERROR, because it will turn into PANIC, which is
> I did that similar to other GIN/GiST places. BTW, BTree  directly emits PANIC 
> if
> PageAddItem fails
> 

I'd still prefer PANIC over an ERROR that will always turn into a PANIC.
I'll leave it as you did, though.

> > 
> > 4. Heikki mentioned:
> > http://archives.postgresql.org/pgsql-hackers/2008-11/msg01832.php
> > 
> > "To make things worse, a query will fail if all the matching 
> > fast-inserted tuples don't fit in the non-lossy tid bitmap."
> > 
> > That issue still remains, correct? Is there a resolution to that?
> 
> Now gincostestimate can forbid index scan by disable_cost (see Changes). Of 
> course, it doesn't prevent failure in case of large update (for example), but 
> it 
> prevents in most cases. BTW, because of sequential scan of pending list cost 
> of 
> scan grows up fast and index scan becomes non-optimal.

Is this a 100% bulletproof solution, or is it still possible for a query
to fail due to the pending list? It relies on the stats collector, so
perhaps in rare cases it could still fail?

It might be surprising though, that after an UPDATE and before a VACUUM,
the gin index just stops working (if work_mem is too low). For many
use-cases, if GIN is not used, it's just as bad as the query failing,
because it would be so slow.

Can you explain why the tbm must not be lossy?

Also, can you clarify why a large update can cause a problem? In the
previous discussion, you suggested that it force normal index inserts
after a threshold based on work_mem:

http://archives.postgresql.org/pgsql-hackers/2008-12/msg00065.php

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq WSACleanup is not needed

2009-01-18 Thread James Mansion

Andrew Chernow wrote:

m$ docs indicate that wsastartup can't be called from dllmain :(

OK, fair cop.  Says it in the MSDN online version but not in the SDK 6.1 
version. :-(  Some helper(s)

must start threads I guess.

Re the counting and doing it on first/last socket - of course WSAStartup 
counts internally. Prsumably

its only slow when the count is actually going to zero?

Is there a need for a new API to control this - can't you just interpret 
another parameter keyword
in PQconnectdb (or the pgoptions string in PQsetdbLogin I guess)?  
(Having said that, how
do you control send and receive buffer sizes?  Presumably nagle is 
always disabled, but those
features could be controlled the same way?  Would presumably allow 
PGOPTIONS to be
parsed too, though docs 30.12 says that does runtime options for the 
server, though
PQconnectdb in 30.1 suggests that it might be parsed for the client 
connect too.  Maybe.).


James


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: B-Tree emulation for GIN

2009-01-18 Thread Jeff Davis
On Fri, 2009-01-16 at 17:42 +0300, Teodor Sigaev wrote:
> > Unfortunately, that means numeric is not supportable for btree-gin
> > because there is no left-most value from which to start the scan. Do you
> > see an easy workaround to support numeric?
> Hmm, let we use minimal varlena struct (with size equal to VARHDRSZ) as 
> left-most (and fake) value. It is never used for any actual argument except 
> compare function, so, new compare function is provided. New version of patch 
> is 
> attached and it based on on your patch (btree-gin.20090111.patch.gz).
> 

Looks good to me. I updated the CommitFest wiki to point to this patch.

I like the fact that this patch does not modify the numeric ADT. It
still relies on the fact that the numeric type will never make use of
the minimal varlena struct, however. I bring this up in case someone
sees it as a problem.

Thanks,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] join removal

2009-01-18 Thread Robert Haas
Apologies for submitting new patches while we're still in the midst of
CommitFest:November, but I think I've more or less come to the end of
the reviewing that I can usefully do for 8.4 (or at least, I haven't
had any requests lately, feel free...) and I don't want to wait
forever to start thinking about 8.5.

This patch is based on Simon Riggs' earlier work on join removal,
although I don't believe I've actually used any of his code.  Tom
Lane's review comments on Simon's code were invaluable in getting this
to work.  I also understand that Simon plans to do more work in this
area, so if this ends up getting sucked into his work or visca versa
(or this ends up getting superseded altogether), that is fine.

http://archives.postgresql.org/pgsql-patches/2008-08/msg00035.php
http://archives.postgresql.org/pgsql-patches/2008-09/msg00024.php

The theoretical underpinning of this patch is as follows: Joins can
affect the result of a query in one of two ways: either they change
the set of rows returned (by either increasing it or decreasing it),
or they provide attributes.  To remove a join, we need to prove that
neither of these things is possible.  It's pretty easy to test that
the join isn't providing any attributes above the level of the join,
and you'll see that rel_attrs_needed_above_join() handles that here.
Verifying that the number of rows returned isn't changed by the join
is harder.

For the sake of simplicity, this patch only attempts to handle LEFT
JOINs.  Specifically, it attempts to prove that the results of
scanning the outer side of the joinrel will be identical to the
results of performing a merge-join, up to sorting.  Nothing that
happens on the inner side of the join can reduce the number of output
rows (except via a WHERE-clause qual that will fail the
rel_attrs_needed_above_join test anyway), so the only thing we need to
worry about is the possibility that the join might increase the output
row count.  To do that, we need to know that there will be a
sufficiently strong unique-ness property on the rows emitted by the
inner side of the join.  I believe that there are two main cases in
which we could potentially prove this: the inner side of the join is
an index scan (without resort) on a unique index, or the inner side of
the join is a subselect with a group by clause over the join columns.
For now, I'm only attempting to handle the first case, though it might
be nice to eventually handle the other as well.

Simon's original patch starts by checking whether the RelOptInfo for
the relation to be dropped is a base relation, but that approach seems
somewhat limiting, because we certainly want to be able to remove
multiple joins in succession.  Setting the paths for the joinrel {B C}
to the paths for B doesn't help us when we try to form {A B C},
because now the join of {A} to {B C} is not a join against a base rel
and join removal fails.  The other problem is that at that level we
don't really understand what's up with the join clauses: is the join
operator even an equality operator?  There is finally the problem that
while we can get at the relevant IndexOptInfo structures and figure
out which combinations of attributes have unique indices, I can't
figure out any sane way to relate those attribute numbers back to the
join clauses.

So I've taken the alternative approach of working on the level of
paths.  sort_outer_and_inner(), before forming a join path, checks
whether there happens to be a unique index with the same pathkeys as
the inner side of the path.  If so, and if no attributes are needed
from the inner side of the join, the it pulls up all the paths from
the outerrel into the joinrel and exits (really, it should skip the
rest of add_paths_to_joinrel() as well, but it didn't seem worth
refactoring that without some validation of the basic approach).  This
approach is limiting because sort_outer_and_inner() doesn't try every
possible combination of pathkeys that might be helpful to us, but it
seems likely to be adequate here for the same reasons that
sort_outer_and_inner() gets away with it generally: the list of
mergeclauses tends to be real short.

(I think that if we were trying to optimize away an INNER join, this
approach would be inadequate, because any non-merge-joinable join
clauses could change the number of output rows.  Here that can't
happen: the most they can do is force the inner side to NULL, but if
the inner side isn't being used that doesn't matter anyway.  Of
course, for INNER JOINs, we'd also need to worry about the
merge-joinable clauses themselves stripping rows from the output due
to a failure to match; we'd need to check for foreign key
constraints.)

I have a feeling there are probably lingering bugs in here, but it
passes regression tests and some sample queries I tried still return
the correct answers even after nuking one or more joins.

Any comments appreciated,

...Robert
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c

Re: [HACKERS] Visibility map and freezing

2009-01-18 Thread ITAGAKI Takahiro

Gregory Stark  wrote:

> >> I don't think we can perfectly capture the meaning of these GUCs in the
> >> name. I think our goal should be to avoid confusion between them.
> 
> I was thinking it would be clearer if the options which control *when*
> autovacuum fires off a worker consistently had some action word in them like
> "trigger" or "start" or "launch".

I think we need more explanations about those variables,
not only "how to work" but also "how to tune" them.
I feel they are un-tunable parameters.

Our documentation says:
| Larger values of these settings
| preserve transactional information longer, while smaller values increase
| the number of transactions that can elapse before the table must be
| vacuumed again.
i.e, we are explaining the variables only as "Larger is better",
but is it really true?

I think we should have answers about the following questions:

- What relation are there between autovacuum_freeze_max_age,
  vacuum_freeze_min_age and vacuum_freeze_table_age? If we increase
  one of them, should we also increase the others?

- Is it ok to increase the variables to maximum values?
  Are there any trade-off?

- Are there some conditions where whole-table-scanning vacuum is more
  effective than vacuums using visibility map? If so, we should switch
  to full-scan *automatically*, without relying on user configurations.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq WSACleanup is not needed

2009-01-18 Thread Andrew Chernow

James Mansion wrote:


If you have a DLL for libpq, could you do it in process attach and 
detach?  Wouldn't

that be the most common case anyway?




m$ docs indicate that wsastartup can't be called from dllmain :(

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq WSACleanup is not needed

2009-01-18 Thread James Mansion

Magnus Hagander wrote:

The use-case of rapidly creating and dropping connections isn't
particularly common, I think. And there is a perfectly functioning
workaround - something that we should perhaps document in the FAQ or
somewhere in the documentation?
  
Would it be accetable to do initialise if the number of connections is 
changing from 0, and
tidy if the cumber goes back to 0?  Applications that retain a 
connection would not

suffer the cost on subsequent connect/disconnect.

The init/term is the tidiest way to do it, but the above might help - 
perhaps init could just

add a phantom usecount and work the same way.

If you have a DLL for libpq, could you do it in process attach and 
detach?  Wouldn't

that be the most common case anyway?

James


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump versus views and opclasses

2009-01-18 Thread Brendan Jurd
On Mon, Jan 19, 2009 at 7:47 AM, Tom Lane  wrote:
> I've applied a patch for this to HEAD and 8.3.
>

Cool, thanks Tom.

Just noting that I've tested your fix on both branches, and in both
cases pg_dump's output came out nice, clean and consistent.

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump versus views and opclasses

2009-01-18 Thread Tom Lane
"Brendan Jurd"  writes:
> On Sun, Jan 18, 2009 at 5:56 AM, Tom Lane  wrote:
>> Interesting --- it works as expected in 8.2.  The problem seems to have
>> been created by the introduction of arrays over composites in 8.3.

> Congratulations on identifying the source!

I've applied a patch for this to HEAD and 8.3.

>> I'm thinking that adding operator family info to SortGroupClause in
>> HEAD might be a good thing too, but it is for a different issue than
>> this --- it's to prevent you dropping a family that a view has a
>> semantic dependency on.  However there might be room for a bit of
>> argument there: if there's both a btree and a hash opfamily then
>> the view could use either, so preventing you dropping one would be
>> a bit arbitrary.

> Not to mention that, with the new default btree opclass in 8.4 that I
> stumbled across earlier, dropping an opclass that a view uses may not
> necessarily murder the view ... it may simply cause it to fall back to
> the default opclass.  If falling back to the default opclass was
> actually what the user wanted to achieve by dropping the user-defined
> opclass, a dependency such as you describe might leave him with no way
> to do that.

> Worst case scenario is that you drop an opclass which causes a view to
> die, and next time you try to run the view you get an intelligible
> error message telling you why it doesn't work anymore.  That's
> probably not the end of the world.

True.  I'll leave well enough alone for the moment, unless someone
comes up with additional arguments.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq WSACleanup is not needed

2009-01-18 Thread Andrew Chernow

Personally, I don't think its the job of libpq to call wsa startup or
shutdown.  Pulling it out now will surely break existing apps and piss


I'm afraid it is. Looking at the API docs, the very first paragraph says:
"The WSAStartup function must be the first Windows Sockets function
called by an application or DLL. It allows an application or DLL to
specify the version of Windows Sockets required and retrieve details of
the specific Windows Sockets implementation. The application or DLL can
only issue further Windows Sockets functions after successfully calling
WSAStartup."




I just think libpq should provide a way of turning off built in startups. 
Outside of my original per-conn performance concern, an application may want 
libpq to use a version of winsock that is different than the one libpq is using. 
 After reading the very handy doc blurb you so graciously supplied, it appears 
to be one of the main reasons wsastartup exists ;-)


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq WSACleanup is not needed

2009-01-18 Thread Andrew Chernow

Tom Lane wrote:

Magnus Hagander  writes:

Yeah. So the question is, do we want to bite the bullet and create
init() and uninit() functions for libpq?


I think if calling them is an optimization that improves performance
(by eliminating per-connection-start overhead), this could fly.  If
the plan is "we are going to require applications to call these
or they'll break", it's not going to be acceptable ...

regards, tom lane



My suggestion was to make calling the init/uninit optional (well uninit should 
only be optional if init was not called).  I think libpq should behave 
identically if init() is never called.  What init() gets you is the ability to 
fine tune libpq (change the default behaviors).  For instance: a bit mask 
argument to init() called "options", that allows one to toggle things on/off in 
libpq: like PG_OPT_NOWSAINIT or PG_OPT_NOSSLINIT.  It may requrie something like 
to be expandable:


int
PQinit(int info_type, void *info_struct, int options);

I'm just spit-ball'n here.  My point is, its could be a good place to allow run 
time configuration of libpq.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Peter Eisentraut
On Sunday 18 January 2009 12:43:46 Grzegorz Jaskiewicz wrote:
> > -Wformat-security warns about
> >
> >printf(var);
> >
> > but not about
> >
> >printf(var, a);
> >
> > I don't understand that; the crash or exploit potential is pretty
> > much the
> > same in both cases.
>
> not at all. First case allows you to pass in var from outside, with
> your, well crafted format strings. Please read more about subject,
> before you say something that silly.

If your premise is that var is passed in from the outside, then the real issue 
is the %n placeholder.  And then it doesn't matter how many variadic args you 
pass.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Peter Eisentraut
On Sunday 18 January 2009 21:15:28 Tom Lane wrote:
> BTW, does the gettext infrastructure make any checks to ensure that
> translators didn't bollix the format codes?  It seems like that should
> be doable with just a SMOP, but I don't know if it's in there or not.

Yes, that is all taken care of.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq WSACleanup is not needed

2009-01-18 Thread Tom Lane
Magnus Hagander  writes:
> Yeah. So the question is, do we want to bite the bullet and create
> init() and uninit() functions for libpq?

I think if calling them is an optimization that improves performance
(by eliminating per-connection-start overhead), this could fly.  If
the plan is "we are going to require applications to call these
or they'll break", it's not going to be acceptable ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Tom Lane
Gregory Stark  writes:
> Tom Lane  writes:
>> The really nasty cases are like this:
>> const char *myfmt = gettext_noop("Some bleat about object \"%s\".");
>> ...
>> errmsg(myfmt, objectname)

> It makes sense to me: if you have arguments for the format string then
> presumably you've at some point had to check that the format string has
> escapes for those arguments.

Actually, there was just an issue in the open patch for column
privileges where Stephen had added a format string that failed to match
the arguments that would be supplied.  What'd be really useful is some
way to tie the constants themselves to the errmsg call for error
checking purposes ... can't see a decent way to do it though.

BTW, does the gettext infrastructure make any checks to ensure that
translators didn't bollix the format codes?  It seems like that should
be doable with just a SMOP, but I don't know if it's in there or not.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq WSACleanup is not needed

2009-01-18 Thread Magnus Hagander
Andrew Chernow wrote:
> Magnus Hagander wrote:
>> Andrew Chernow wrote:
>>> WSACleanup is not really needed during a PQfinish.  Its horribly slow if
>>>  the library ref count is 0 and it actually unloads the winsock library,
>>> adds 225ms to PQfinish.
>>>
>>> Solution:
>>> A) Call WSAStartup once and never clean it up.  When the app dies, so do
>>> the ref counts and winsock is automatically unloaded.
>>>
>>> B) Have a way of specifying the behavior, the way it is now or tell
>>> libpq to not initialize wsa at all (kinda like ssl init callbacks).
>>> Leave it up to the application.
>>>
>>> I think the WSA startup/cleanup stuff is silly.  If I dynamically link
>>> with a DLL, I want it automatically loaded and cleaned up.
>>>
>>> Worst case, your app makes lots of connections to different backends.
>>> So, it is constantly doing PQconnectdb and PQfinish; only has a single
>>> conn open at a time.  This means its constantly loading and unloading
>>> winsock.
>>
>> Option A will make us leak the reference to it though, won't it? And we
>> are supposed to clean up after ourselves...
>>
> 
> Personally, I don't think its the job of libpq to call wsa startup or
> shutdown.  Pulling it out now will surely break existing apps and piss

I'm afraid it is. Looking at the API docs, the very first paragraph says:
"The WSAStartup function must be the first Windows Sockets function
called by an application or DLL. It allows an application or DLL to
specify the version of Windows Sockets required and retrieve details of
the specific Windows Sockets implementation. The application or DLL can
only issue further Windows Sockets functions after successfully calling
WSAStartup."


Simply put: The API requires we do it.


> people off, so I don't think this is an option.  If anything, it should
> not be a per conn thing.  Its really a library wide thing.  Think about
> it, there is no gain in doing this per conn.  Not to mention, I just
> found a major issue with it.

That is true, however. I'm not sure I'll agree it's a major issue, but
it's certainly sub-optimal.

The use-case of rapidly creating and dropping connections isn't
particularly common, I think. And there is a perfectly functioning
workaround - something that we should perhaps document in the FAQ or
somewhere in the documentation?


>> Now, if we actually had libpq_init()/uninit() or something like it, it
>> would make sense to move it there. But I'm not sure we want to just leak
>> the reference. But I'm not entirely convinced either way :-)
>>
> 
> There is probably someone out there that wants wsa to completely unload
> when there done using libpq (maybe its a long-lived app like an NT
> service).  The only thing a leaked ref would do is never unload wsa
> until the app exited.  So, this is probably bad behavior.

No, it can also hold internal WSA references and structures.


> libpq_init() is where an app should elect what libpq should load.  If
> init is never called, everything should work as it does now.  Something
> like that. openssl init stuff should be controlled by the same function,
> maybe some other components.  Auto-init'n is a nice feature most of the
> time, but it suffers from ASSuming how and when an app wants to perfom
> the init.

Yeah. So the question is, do we want to bite the bullet and create
init() and uninit() functions for libpq? It'll be a new version of
libpq, and apps will require the new version in order to use it, so it's
a fairly large change to the API.. However, having *had* one, would
certainly have made the SSL fixes that Bruce put in earlier a lot
simpler

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Magnus Hagander
Tom Lane wrote:
> alan...@gmail.com writes:
>>> One thing to watch out for is that the intention may have been to allow
>>> the strings to be translated.
> 
>> I'm not sure if that's the case. How does one find out?
> 
> If the origin of the "variable" format is a constant or set of constants
> decorated with gettext_noop(), then this type of edit will have defeated
> the intended localization.  In the cases at hand, I believe that all but
> one of your proposed patches break the desired behavior.
> 
> What's worse, I see that Magnus got there before you, and has broken
> localization here and in several other places:
> http://archives.postgresql.org/pgsql-committers/2008-11/msg00264.php

> Magnus, you wanna clean up the mess?  

Crap. Yeah, I'll try to get around to that soon. No time tonight though.

> And what patch does the "few more"
> comment refer back to?

I think it refers to this:
http://archives.postgresql.org/pgsql-committers/2008-11/msg00249.php

Initially it came out of this thread:
http://archives.postgresql.org/pgsql-hackers/2008-11/msg01348.php

If my memory is correct, there shouldn't be more than those two patches.


> A workable solution that both silences the warning and preserves
> localizability is to follow a coding pattern like this:
> 
>   const char *mymsg = gettext_noop("Some text to be localized.");
> 
>   ...
> 
>   errmsg("%s", _(mymsg))  // not just errmsg(mymsg)
> 
> I would recommend that we do this, because otherwise we are certainly
> going to have more breakage from well-intentioned patchers, whatever
> Peter's opinion of the merits of the compiler warning might be ;-)

Seems reasonable.

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Gregory Stark
Tom Lane  writes:

> The really nasty cases are like this:
>
>   const char *myfmt = gettext_noop("Some bleat about object \"%s\".");
>
>   ...
>
>   errmsg(myfmt, objectname)
>
> where there really is no simple way to convince the compiler that you
> know what you're doing without breaking functionality.  This is probably
> why -Wformat-security doesn't warn about the latter type of usage.  It
> does kind of beg the question of why bother with that warning though ...

It makes sense to me: if you have arguments for the format string then
presumably you've at some point had to check that the format string has
escapes for those arguments.

The only danger in the coding style comes from the possibility that there are
escapes you didn't anticipate. It's a lot harder to expect specific non-zero
escapes and find something else than to just not think about it at all and
unknowingly depend on having no escapes.

And it would take willful ignorance to depend on having some specific set of
escapes in an unchecked string provided by an external data source, which is
where the worst danger lies.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Tom Lane
alan...@gmail.com writes:
>> One thing to watch out for is that the intention may have been to allow
>> the strings to be translated.

> I'm not sure if that's the case. How does one find out?

If the origin of the "variable" format is a constant or set of constants
decorated with gettext_noop(), then this type of edit will have defeated
the intended localization.  In the cases at hand, I believe that all but
one of your proposed patches break the desired behavior.

What's worse, I see that Magnus got there before you, and has broken
localization here and in several other places:
http://archives.postgresql.org/pgsql-committers/2008-11/msg00264.php

Magnus, you wanna clean up the mess?  And what patch does the "few more"
comment refer back to?

A workable solution that both silences the warning and preserves
localizability is to follow a coding pattern like this:

const char *mymsg = gettext_noop("Some text to be localized.");

...

errmsg("%s", _(mymsg))  // not just errmsg(mymsg)

I would recommend that we do this, because otherwise we are certainly
going to have more breakage from well-intentioned patchers, whatever
Peter's opinion of the merits of the compiler warning might be ;-)

The really nasty cases are like this:

const char *myfmt = gettext_noop("Some bleat about object \"%s\".");

...

errmsg(myfmt, objectname)

where there really is no simple way to convince the compiler that you
know what you're doing without breaking functionality.  This is probably
why -Wformat-security doesn't warn about the latter type of usage.  It
does kind of beg the question of why bother with that warning though ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Heikki Linnakangas

Grzegorz Jaskiewicz wrote:

On 2009-01-18, at 09:56, Peter Eisentraut wrote:

-Wformat-security warns about

   printf(var);

but not about

   printf(var, a);

I don't understand that; the crash or exploit potential is pretty much 
the

same in both cases.
not at all. First case allows you to pass in var from outside, with 
your, well crafted format strings. Please read more about subject, 
before you say something that silly.


The point is that if "var" comes from an untrusted source, both forms 
are just as dangerous.


I guess that in practice, the first form is more likely to be an oversight.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] VARSIZE - why omit VARLEN?

2009-01-18 Thread Andrew Chernow

Tom Lane wrote:

Andrew Chernow  writes:

Any objections adding the below to postgres.h?
#define VARLEN(PTR) (VARSIZE(PTR) - VARHDRSZ)


For the most part, you should probably be using VARSIZE_ANY_EXHDR
anyplace that that might be a good idea.

regards, tom lane




Thanks, that will do it.  I didn't know that macro existed.  I missed its 
addition in 8.3.  sorry.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Meridiem markers (was: [BUGS] Incorrect "invalid AM/PM string" error from to_timestamp)

2009-01-18 Thread Brendan Jurd
On Sat, Sep 27, 2008 at 4:25 AM, Brendan Jurd  wrote:
> Currently, Postgres accepts four separate flavours for specifying
> meridiem markers, given by uppercase/lowercase and with/without
> periods:
>
>  * am/pm
>  * AM/PM
>  * a.m./p.m.
>  * A.M./P.M.

>
> I would go so far as to say that we should accept any of the 8 valid
> meridiem markers, regardless of which flavour is indicated by the
> formatting keyword.
>
> Day and month names already work this way.  We don't throw an error if
> a user specifies a mixed-case month name like "Sep" but uses the
> uppercase formatting keyword "MON".

I've been thinking further about this lately, and whilst the month and
day name tokens aren't fussy about *case*, they do make a distinction
about *length*.

So, while MON will match "Sep", "SEP" and "sep" just fine, it will
have issues with "September" (it will match the first three characters
as "Sep" and then leave the remaining characters "tember" to bork up
the next token).

Likewise, MONTH will not match "Sep", it needs the full month name.

I think, for to_timestamp(), it's important that the user have a solid
idea of how many characters each formatting token wants to consume.
With the am/pm and bc/ad markers, we've got two possibilities for
length; without periods (2 characters) and with periods (4
characters).  Having the 2-character token match against a 4-character
string might cause more confusion than convenience.

It may make more sense to keep the different lengths separate, so that
a 2-character token will match any of "am", "pm", "AM", "PM", and a
4-character token will match any of "a.m.", "p.m.", "A.M.", "P.M.".

Comments?

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread alanwli

On Jan 17, 2009 3:34pm, Peter Eisentraut  wrote:

On Saturday 17 January 2009 11:44:07 Alan Li wrote:

> Attached are patches to fix the following compiler warnings that I see  

when


> using gcc 4.3.2.

>

> MASTER warning:

> tablecmds.c: In function 'DropErrorMsgWrongType':

> tablecmds.c:601: warning: format not a string literal and no format

> arguments

>

> REL8_3_STABLE warnings:

> utility.c: In function 'DropErrorMsgWrongType':

> utility.c:129: warning: format not a string literal and no format  

arguments


> trigger.c: In function 'ConvertTriggerToFK':

> trigger.c:600: warning: format not a string literal and no format  

arguments


> trigger.c:616: warning: format not a string literal and no format  

arguments


> trigger.c:628: warning: format not a string literal and no format  

arguments


> guc.c: In function 'set_config_option':

> guc.c:4424: warning: format not a string literal and no format arguments

> describe.c: In function 'describeOneTableDetails':

> describe.c:1294: warning: format not a string literal and no format

> arguments



You apparently have your compiler configured with -Wformat-security. Our  

code


doesn't do that. I think the cases the warning complains about are fine  

and


the way the warning is designed is a bit bogus.



Yeah, you're right. I'm using gcc 4.3.2 on Ubuntu 8.10, which uses  
-Wformat-security by default.


Alan


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Grzegorz Jaskiewicz


On 2009-01-18, at 09:56, Peter Eisentraut wrote:


On Sunday 18 January 2009 08:28:51 Tom Lane wrote:
Yeah, the risk this is trying to guard against is variables  
containing

"%" unexpectedly.  Even if that's not possible, it requires some work
to verify and it's a bit fragile.  I didn't look at the specific  
cases

yet but in general I think this is a good policy.


-Wformat-security warns about

   printf(var);

but not about

   printf(var, a);

I don't understand that; the crash or exploit potential is pretty  
much the

same in both cases.
not at all. First case allows you to pass in var from outside, with  
your, well crafted format strings. Please read more about subject,  
before you say something that silly.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread alanwli

One thing to watch out for is that the intention may have been to allow

the strings to be translated.



regards, tom lane



I'm not sure if that's the case. How does one find out?

Alan


Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Peter Eisentraut
On Sunday 18 January 2009 08:28:51 Tom Lane wrote:
> Yeah, the risk this is trying to guard against is variables containing
> "%" unexpectedly.  Even if that's not possible, it requires some work
> to verify and it's a bit fragile.  I didn't look at the specific cases
> yet but in general I think this is a good policy.

-Wformat-security warns about

printf(var);

but not about

printf(var, a);

I don't understand that; the crash or exploit potential is pretty much the 
same in both cases.

-Wformat-nonliteral warns about both cases.  We have legitimate code that 
requires this, however.

What would be helpful is a way to individually override the warning for the 
rare code where you know what you are doing.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statement-level triggers and inheritance

2009-01-18 Thread Peter Eisentraut
On Sunday 18 January 2009 08:24:47 Tom Lane wrote:
> Peter Eisentraut  writes:
> > On Thursday 15 January 2009 02:08:42 Bruce Momjian wrote:
> >> Added to TODO:
> >> Have statement-level triggers fire for all tables in an
> >> inheritance hierarchy
> >
> > I don't think that was really the conclusion from the thread.
> >
> > As far as I can interpret the opinions, statement level triggers should
> > fire on the parent table only, rather than on some child, as it currently
> > does.
>
> I think the consensus was that each table should have its own statement
> triggers (if any) fire.  Which is one possible reading of Bruce's TODO
> item, but it's surely not clearly worded.

We should also consult the SQL standard.  Its language regarding inheritance 
is sometimes not in line with our implementation (see recent discussion about 
GRANT).

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers