Re: [HACKERS] LATERAL

2009-10-18 Thread Robert Haas
On Sat, Oct 17, 2009 at 10:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 That still leaves a lot of silly paths, though.  In many cases, if
 you're thinking about joining A to {B C} using an index-accelerated
 path, you'd be just as well off joining to B first and then to C.  So
 it might be that we only need to consider index-accelerated paths when
 there is no legal join between the LHS and a subset of the RHS.

 Yeah.  If there are no join order constraints, it's always possible to
 form a plan such that you join a given rel only once you have all the
 rels needed (to provide values for the inner indexscan) on the lefthand
 side.  I think this is probably the main reason why the issue was not
 treated in the planner originally.  So maybe the way to think about
 this is as a way of dealing with join order constraints without losing
 the benefits of inner indexscans.

 The other problem I see here is that the bottom-up approach that we
 use in general is going to be difficult to apply here, because the set
 of paths will vary depending on what parameters are pushed down from
 the outer side.

 Well, we deal with that already --- the set of possible inner indexscan
 paths already varies depending on what the LHS is.  I think the point
 here is that we'd be considering an inner path that's against an LHS
 that it's not legal to join the inner rel to *directly*.  Such a path
 would only be legal if we later join to that LHS at a higher join level.
 So we'd be keeping around some tentative paths that might not ever form
 a valid join plan.

 Maybe we should turn around the way that inner indexscan paths are
 made.  Currently we form them on-the-fly while considering a valid
 join combination.  Maybe we should build them all at the first level
 (driving this off the set of available join clauses for each base rel)
 and mark each such path as "requires a join to this other set of rels
 to be valid".  But then we'd go ahead and join such paths to *other*
 rels, keeping the resulting join paths still marked as requiring the
 same future join.  Once that join actually happens, the resulting path
 becomes fully valid.  Only a join to a proper subset of the future-join
 requirement would be disallowed meanwhile.

 I'm not even sure this would be slower or more complicated than what we
 do now --- if you look at the logic that caches potential inner
 indexscan plans, it's almost doing this already.  It would result in
 considering more join paths, but only ones that have some plausible use.

Wow, that's pretty sneaky.  I had to read this email twice before I
understood what you were proposing.  It sounds to me like it will
work.  I'm not 100% sure what the impact on planner performance will
be, but it's at least plausible that it will be OK.

I think you should only ever join an incomplete inner-indexscan path
to (1) another inner-indexscan path or (2) the cheapest total path for
*exactly* the future-join requirement.  Anything else doesn't seem
productive.

...Robert



Re: [HACKERS] LATERAL

2009-10-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I think you should only ever join an incomplete inner-indexscan path
 to (1) another inner-indexscan path or (2) the cheapest total path for
 *exactly* the future-join requirement.  Anything else doesn't seem
 productive.

I don't see any reason to constrain the form of joins before the
future-join requirement.  Once you get to the join that satisfies
that requirement, obviously you must implement it as a nestloop
with the inner-indexscan path on the inside.  But there shouldn't
be any more constraint beyond that one for that join, either:
* you might want a fast-start plan not cheapest total
* sort orderings of the outer path might be useful

We know that sort ordering of the inner indexscan or its joins
won't be useful once we reach the future-join level, so it might
be possible to discard paths that are not cheapest total cost
below that level.  The problem is to distinguish sort orderings
that are useful within joins below that level.  You could avoid that
if you could convince yourself that there's no point in ever doing
a mergejoin at such a level, but I don't think I believe that.

It may well be that there's no point in being too tense about this issue
anyway.  The planner will only consider early joins to rels that have
join clauses to the rel with the inner-indexscan path.  There probably
won't be that many of them.

regards, tom lane



Re: [HACKERS] Rejecting weak passwords

2009-10-18 Thread Ron Mayer
Bruce Momjian wrote:
 Yep, this is illustrating something that is pretty basic to open source
 --- that is open source often provides the tools for a solution, rather
 than a complete solution.  I often think of open source as providing a
 calculator with wires sticking out, rather than calculator buttons;  the
 wires allow more flexibility, but they are harder to use.

I disagree.  Open source typically provides the complete solution
too - just not from the developer who programs one component of the
solution.

Checklist writers intentionally use this to make straw-man arguments.
People used to say "Linux doesn't even have a GUI", noting that
X11 is a separate project.

Now people have database checkboxes for:

 * a GUI admin tool (which we have, though it's a separate package)
 * GIS data types (which we have, though it's a separate package)
 * server-side password filters (which we have, though via LDAP, etc.)
 * replication (which we have, though via many packages)
 * clustering (which we have, though via HadoopDB)

The Linux guys successfully communicated that it isn't fair
for checklists to compare an OS kernel against commercial
application suites.   Seems it'd be good for the postgres
project to similarly communicate that the database kernel
is the core of a platform that's broader than just a
database kernel.

   Ron




[HACKERS] Controlling changes in plpgsql variable resolution

2009-10-18 Thread Tom Lane
As most of you will recall, plpgsql currently acts as though identifiers
in SQL queries should be resolved first as plpgsql variable names, and
only failing that do they get processed as names of the query.  The
plpgsql parser rewrite that I'm working on will fix that for the
obviously-silly cases where a plpgsql variable is substituted for a
table name or some other non-scalar-variable identifier.  However, what
should we do when a name could represent either a plpgsql variable
or a column of the query?  Historically we've resolved it as the
plpgsql variable, but we've sure heard a lot of complaints about that.
Oracle's PL/SQL has the precedence the other way around: resolve first
as the query column, and only failing that as a PL variable.  The Oracle
behavior is arguably less surprising because the query-provided names
belong to the nearer enclosing scope.  I believe that we ought to move
to the Oracle behavior over time, but how do we get there from here?
Changing it is almost surely going to break a lot of people's functions,
and in rather subtle ways.

I think there are basically three behaviors that we could offer:

1. Resolve ambiguous names as plpgsql (historical PG behavior)
2. Resolve ambiguous names as query column (Oracle behavior)
3. Throw error if name is ambiguous (useful for finding problems)

(Another possibility is to throw a warning but proceed anyway.  It would
be easy to do that if we proceed with the Oracle behavior, but *not*
easy if we proceed with the historical PG behavior.  The reason is that
the code invoked by transformColumnRef may have already made some
side-effects on the query tree.  We discussed the implicit-RTE behavior
yesterday, but there are other effects of a successful name lookup,
such as marking columns for privilege checking.)
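
To make the ambiguity concrete, here is a minimal sketch, assuming a
hypothetical table t with a column named val:

CREATE TABLE t (val text);

CREATE FUNCTION f() RETURNS void AS $$
DECLARE
    val text := 'variable';
BEGIN
    -- Is "val" below the plpgsql variable or the column t.val?
    -- Behavior #1 reads the variable, #2 reads the column,
    -- #3 raises an error.
    PERFORM 1 FROM t WHERE val = 'something';
END;
$$ LANGUAGE plpgsql;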

What I'm wondering about at the moment is which behaviors to offer and
how to control them.  The obvious answer is "use a GUC", but that answer
scares me because of the ease with which switching between #1 and #2
would break plpgsql functions.  It's not out of the question that that
could even amount to a security problem.  I could see using a GUC to
turn the error behavior (#3) on and off, but not to switch between #1
and #2.

Another possibility is to control it on a per-function basis by adding
some special syntax to plpgsql function bodies to say which behavior
to use.  We could for instance extend the never-documented #option
syntax.  This is pretty ugly and would be inconvenient to use too
--- if people have to go and add #option something to a function,
they might as well just fix whatever name conflicts it has instead.

I'm not seeing any choice that seems likely to make everybody happy.
Any comments or ideas?

regards, tom lane



[HACKERS] Boosting cost estimates for some built-in functions

2009-10-18 Thread Tom Lane
We recently saw a complaint about psql \d commands being quite slow
with many tables, which turned out to be because the planner was
putting a has_table_privilege() call where it would get executed a
lot, before other cheaper tests on pg_class.  This is not the
planner's fault --- it has no information that would suggest that
has_table_privilege() is any more expensive than any other built-in
function.

In another case I was looking at just now, it seems that to_tsquery()
and to_tsvector() are noticeably slower than most other built-in
functions, which is not surprising given the amount of mechanism that
gets invoked inside them.  It would be useful to tell the planner
about that to discourage it from picking seqscan plans that involve
repeated execution of these functions.
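
For instance, in a query like the following (hypothetical table docs),
a seqscan plan evaluates to_tsvector() once per row, which is exactly
what a higher cost estimate would discourage:

SELECT * FROM docs
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'database');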

I'd like to suggest boosting the built-in cost estimates for the
xxx_has_privilege functions to perhaps 10.  to_tsquery and to_tsvector
maybe should be boosted even higher, but I don't have a good specific
number in mind.
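
For anyone wanting to experiment before any built-in change, the
per-installation equivalent can be done with ALTER FUNCTION; the
numbers here are purely illustrative:

ALTER FUNCTION has_table_privilege(oid, text) COST 10;
ALTER FUNCTION to_tsvector(regconfig, text) COST 100;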

Thoughts?

regards, tom lane



Re: [HACKERS] LATERAL

2009-10-18 Thread Robert Haas
On Sun, Oct 18, 2009 at 11:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I think you should only ever join an incomplete inner-indexscan path
 to (1) another inner-indexscan path or (2) the cheapest total path for
 *exactly* the future-join requirement.  Anything else doesn't seem
 productive.

 I don't see any reason to constrain the form of joins before the
 future-join requirement.  Once you get to the join that satisfies
 that requirement, obviously you must implement it as a nestloop
 with the inner-indexscan path on the inside.  But there shouldn't
 be any more constraint beyond that one for that join, either:
        * you might want a fast-start plan not cheapest total
        * sort orderings of the outer path might be useful

 We know that sort ordering of the inner indexscan or its joins
 won't be useful once we reach the future-join level, so it might
 be possible to discard paths that are not cheapest total cost
 below that level.

Yeah, this was what I was driving at, but I got turned around in my
head and was explaining it incorrectly.

 The problem is to distinguish sort orderings
 that are useful within joins below that level.  You could avoid that
 if you could convince yourself that there's no point in ever doing
 a mergejoin at such a level, but I don't think I believe that.

 It may well be that there's no point in being too tense about this issue
 anyway.  The planner will only consider early joins to rels that have
 join clauses to the rel with the inner-indexscan path.  There probably
 won't be that many of them.

You could probably convince me that a merge join is not going to be
too useful (how often can you want a merge join on the inner side of a
nested loop?  ... especially when there are partial index scans
involved) but I think your last point here is well-taken.  We've
certainly turned a corner from "it's just the exponential growth in
the number of join paths to consider that's a problem."  :-)

...Robert



Re: [HACKERS] Boosting cost estimates for some built-in functions

2009-10-18 Thread Simon Riggs
On Sun, 2009-10-18 at 14:50 -0400, Tom Lane wrote:

 I'd like to suggest boosting the built-in cost estimates for the
 xxx_has_privilege functions to perhaps 10.  to_tsquery and to_tsvector
 maybe should be boosted even higher, but I don't have a good specific
 number in mind.

ISTM we should set up a test that measures the cost of those and other
functions.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Boosting cost estimates for some built-in functions

2009-10-18 Thread Robert Haas
On Sun, Oct 18, 2009 at 3:38 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Sun, 2009-10-18 at 14:50 -0400, Tom Lane wrote:

 I'd like to suggest boosting the built-in cost estimates for the
 xxx_has_privilege functions to perhaps 10.  to_tsquery and to_tsvector
 maybe should be boosted even higher, but I don't have a good specific
 number in mind.

 ISTM we should set up a test that measures the cost of those and other
 functions.

Benchmarking???  Heresy!

...Robert



Re: [HACKERS] Controlling changes in plpgsql variable resolution

2009-10-18 Thread Simon Riggs
On Sun, 2009-10-18 at 13:25 -0400, Tom Lane wrote:

 As most of you will recall, plpgsql currently acts as though identifiers
 in SQL queries should be resolved first as plpgsql variable names, and
 only failing that do they get processed as names of the query.  The
 plpgsql parser rewrite that I'm working on will fix that for the
 obviously-silly cases where a plpgsql variable is substituted for a
 table name or some other non-scalar-variable identifier.  However, what
 should we do when a name could represent either a plpgsql variable
 or a column of the query?  Historically we've resolved it as the
 plpgsql variable, but we've sure heard a lot of complaints about that.
 Oracle's PL/SQL has the precedence the other way around: resolve first
 as the query column, and only failing that as a PL variable.  The Oracle
 behavior is arguably less surprising because the query-provided names
 belong to the nearer enclosing scope.  I believe that we ought to move
 to the Oracle behavior over time, but how do we get there from here?
 Changing it is almost surely going to break a lot of people's functions,
 and in rather subtle ways.
 
 I think there are basically three behaviors that we could offer:
 
 1. Resolve ambiguous names as plpgsql (historical PG behavior)
 2. Resolve ambiguous names as query column (Oracle behavior)
 3. Throw error if name is ambiguous (useful for finding problems)
 
 (Another possibility is to throw a warning but proceed anyway.  It would
 be easy to do that if we proceed with the Oracle behavior, but *not*
 easy if we proceed with the historical PG behavior.  The reason is that
 the code invoked by transformColumnRef may have already made some
 side-effects on the query tree.  We discussed the implicit-RTE behavior
 yesterday, but there are other effects of a successful name lookup,
 such as marking columns for privilege checking.)
 
 What I'm wondering about at the moment is which behaviors to offer and
 how to control them.  The obvious answer is "use a GUC", but that answer
 scares me because of the ease with which switching between #1 and #2
 would break plpgsql functions.  It's not out of the question that that
 could even amount to a security problem.  I could see using a GUC to
 turn the error behavior (#3) on and off, but not to switch between #1
 and #2.
 
 Another possibility is to control it on a per-function basis by adding
 some special syntax to plpgsql function bodies to say which behavior
 to use.  We could for instance extend the never-documented #option
 syntax.  This is pretty ugly and would be inconvenient to use too
 --- if people have to go and add #option something to a function,
 they might as well just fix whatever name conflicts it has instead.

I'd suggest two options, one for name resolution (#1 or #2) and one for
error level of ambiguity (none or ERROR).

GUCs are fine, now we have GUC settings per-function.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Controlling changes in plpgsql variable resolution

2009-10-18 Thread Robert Haas
On Sun, Oct 18, 2009 at 1:25 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 As most of you will recall, plpgsql currently acts as though identifiers
 in SQL queries should be resolved first as plpgsql variable names, and
 only failing that do they get processed as names of the query.  The
 plpgsql parser rewrite that I'm working on will fix that for the
 obviously-silly cases where a plpgsql variable is substituted for a
 table name or some other non-scalar-variable identifier.  However, what
 should we do when a name could represent either a plpgsql variable
 or a column of the query?  Historically we've resolved it as the
 plpgsql variable, but we've sure heard a lot of complaints about that.
 Oracle's PL/SQL has the precedence the other way around: resolve first
 as the query column, and only failing that as a PL variable.  The Oracle
 behavior is arguably less surprising because the query-provided names
 belong to the nearer enclosing scope.  I believe that we ought to move
 to the Oracle behavior over time, but how do we get there from here?
 Changing it is almost surely going to break a lot of people's functions,
 and in rather subtle ways.

 I think there are basically three behaviors that we could offer:

 1. Resolve ambiguous names as plpgsql (historical PG behavior)
 2. Resolve ambiguous names as query column (Oracle behavior)
 3. Throw error if name is ambiguous (useful for finding problems)

 (Another possibility is to throw a warning but proceed anyway.  It would
 be easy to do that if we proceed with the Oracle behavior, but *not*
 easy if we proceed with the historical PG behavior.  The reason is that
 the code invoked by transformColumnRef may have already made some
 side-effects on the query tree.  We discussed the implicit-RTE behavior
 yesterday, but there are other effects of a successful name lookup,
 such as marking columns for privilege checking.)

 What I'm wondering about at the moment is which behaviors to offer and
 how to control them.  The obvious answer is "use a GUC", but that answer
 scares me because of the ease with which switching between #1 and #2
 would break plpgsql functions.  It's not out of the question that that
 could even amount to a security problem.  I could see using a GUC to
 turn the error behavior (#3) on and off, but not to switch between #1
 and #2.

 Another possibility is to control it on a per-function basis by adding
 some special syntax to plpgsql function bodies to say which behavior
 to use.  We could for instance extend the never-documented #option
 syntax.  This is pretty ugly and would be inconvenient to use too
 --- if people have to go and add #option something to a function,
 they might as well just fix whatever name conflicts it has instead.

 I'm not seeing any choice that seems likely to make everybody happy.
 Any comments or ideas?

If we just change the default behavior from #1 to #2, it's going to be
insanely easy to dump a database using pg_dump for 8.4, restore into
an 8.5 database, and end up with a function that does something
different and broken.  So I'm opposed to that plan, but amenable to
any of the other options in varying degrees.

I think it would make a fair amount of sense to make #3 the default behavior.

If possible, I think we should try to engineer things so that using
pg_dump 8.5 on an 8.4 database and restoring the result into an 8.5
database produces a function with identical semantics.

...Robert



Re: [HACKERS] LATERAL

2009-10-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 You could probably convince me that a merge join is not going to be
 too useful (how often can you want a merge join on the inner side of a
 nested loop?

Why not?  As Andrew pointed out, what we're really trying to accomplish
here is to consider sub-join plans that are parameterized by a value
obtained from an outer relation.  I think we shouldn't artificially
limit what we consider.
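
For concreteness, written with the proposed LATERAL syntax against a
hypothetical schema, the sort of parameterized sub-join in question
would be:

SELECT o.id, l.total
FROM orders o,
     LATERAL (SELECT sum(li.qty) AS total
              FROM order_lines li JOIN products p ON p.id = li.product_id
              WHERE li.order_id = o.id) l;

Here the sub-join over order_lines and products is parameterized by
o.id from the outer relation, and an inner indexscan on
order_lines(order_id) is the interesting way to implement it.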

But anyway I think we're on the same page here: what we ought to do is
try implementing this scheme without any extra restrictions on what it
considers, and see what the performance is like.  We can try to limit
what it considers if it turns out not to work well in the simplest
form.

regards, tom lane



Re: [HACKERS] LATERAL

2009-10-18 Thread Robert Haas
On Sun, Oct 18, 2009 at 3:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 You could probably convince me that a merge join is not going to be
 too useful (how often can you want a merge join on the inner side of a
 nested loop?

 Why not?  As Andrew pointed out, what we're really trying to accomplish
 here is to consider sub-join plans that are parameterized by a value
 obtained from an outer relation.  I think we shouldn't artificially
 limit what we consider.

 But anyway I think we're on the same page here: what we ought to do is
 try implementing this scheme without any extra restrictions on what it
 considers, and see what the performance is like.  We can try to limit
 what it considers if it turns out not to work well in the simplest
 form.

And then when we get done we can consider implementing $SUBJECT, which
is no longer what this thread is about at all.  :-)

...Robert



Re: [HACKERS] Controlling changes in plpgsql variable resolution

2009-10-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 If possible, I think we should try to engineer things so that using
 pg_dump 8.5 on an 8.4 database and restoring the result into an 8.5
 database produces a function with identical semantics.

Hmm ... actually, we could have pg_dump stick either a #option line
or a GUC SET parameter onto every plpgsql function it pulls from an
old database.  So if you're willing to assume that people do their
upgrades that way, it could be made reasonably safe, even if the
default behavior changes.
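
Schematically, with a hypothetical GUC name standing in for whatever we
end up calling it, a dumped function might come out as:

CREATE FUNCTION f() RETURNS void
    LANGUAGE plpgsql
    SET plpgsql.name_resolution = 'plpgsql-first'  -- hypothetical GUC
    AS $$ BEGIN RETURN; END $$;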

regards, tom lane



Re: [HACKERS] Boosting cost estimates for some built-in functions

2009-10-18 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Sun, 2009-10-18 at 14:50 -0400, Tom Lane wrote:
 I'd like to suggest boosting the built-in cost estimates for the
 xxx_has_privilege functions to perhaps 10.  to_tsquery and to_tsvector
 maybe should be boosted even higher, but I don't have a good specific
 number in mind.

 ISTM we should set up a test that measures the cost of those and other
 functions.

Well, some quick testing suggests that 10x is in the right ballpark for
has_table_privilege versus a simple function such as OID equality,
and that even fairly simple cases of to_tsvector/to_tsquery can be
100x the cost of OID equality.  I'm not sure if there is a typical
case for the latter given the wide range of possible tsearch
configurations, so trying to benchmark it with any degree of precision
might be pointless.
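
For reference, one crude way to reproduce this kind of measurement:

-- expensive case: one has_table_privilege() call per row
EXPLAIN ANALYZE
SELECT has_table_privilege('pg_class', 'SELECT')
FROM generate_series(1, 100000);

-- baseline: a trivial built-in OID comparison per row
EXPLAIN ANALYZE
SELECT g::oid = 1234::oid FROM generate_series(1, 100000) g;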

regards, tom lane



Re: [HACKERS] Controlling changes in plpgsql variable resolution

2009-10-18 Thread Robert Haas
On Sun, Oct 18, 2009 at 4:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 If possible, I think we should try to engineer things so that using
 pg_dump 8.5 on an 8.4 database and restoring the result into an 8.5
 database produces a function with identical semantics.

 Hmm ... actually, we could have pg_dump stick either a #option line
 or a GUC SET parameter onto every plpgsql function it pulls from an
 old database.  So if you're willing to assume that people do their
 upgrades that way, it could be made reasonably safe, even if the
 default behavior changes.

I'm not completely willing to assume that.  I know we recommend it,
but I'm sure that people don't always do it.  And this is pretty
subtle: someone might screw it up and get a non-working function
without realizing it.  The advantage of making the default behavior
throw an error is that it should be pretty obvious if you've broken
something.

And this isn't just about pg_dump, either.  I have a script that gets
run regularly on one of my production databases that goes and updates
the definitions of a whole bunch of functions to the latest version
from the source code repository.  Even if pg_dump DTRT, the next run
of a script of that type might subtly break a bunch of stuff.

The current behavior is a trap for the unwary, so I have no problem
with changing it.  But I think we should try really hard to avoid
creating a more subtle trap in the process.

...Robert



Re: [HACKERS] postgres 8.3.8 and Solaris 10_x86 64 bit problems?

2009-10-18 Thread u235sentinel
Are you sure about this?  When I try to build and don't have OpenSSL in
the lib/include path it claims it needs it.  As I'm building 64-bit I
can now build postgres in 64-bit with OpenSSL 0.9.8k just fine.  However
when I run it I'm getting the same error message.


I'm curious if this is a lost hope.  My boss is recommending we flatten
the Sun box and install Red Hat Linux (which I'm fine with).  I'd rather
not, as threading in Solaris is better.


Thoughts?

thanks


Zdenek Kotala wrote:
You can look at
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=ghost_moth&dt=2009-10-07%2021:06:00
to see how it is built.  You also do not need your own version of OpenSSL;
all security fixes are backported.  It is located in /usr/sfw/lib or
/usr/sfw/lib/64.

Sometimes there are problems with gcc and the Solaris linker.  IIRC, I had
problems with PL/Perl compilation.


Zdenek

On 8.10.09 03:48, u235sentinel wrote:

So I compiled postgres with Solaris 10 and have problems running it.

# ./pg_ctl
ld.so.1: pg_ctl: fatal: relocation error: R_AMD64_32: file 
/usr/local/postgres64/lib/libpq.so.5: symbol (unknown): value 
0xfd7fff1cf210 does not fit

Killed

# ldd pg_ctl
  libpq.so.5 =>       /usr/local/postgres64/lib/libpq.so.5
  libm.so.2 =>        /usr/lib/64/libm.so.2
  libxml2.so.2 =>     /usr/lib/64/libxml2.so.2
  libz.so.1 =>        /usr/lib/64/libz.so.1
  libreadline.so.6 => /usr/local/lib/libreadline.so.6
  libcurses.so.1 =>   /usr/lib/64/libcurses.so.1
  librt.so.1 =>       /usr/lib/64/librt.so.1
  libsocket.so.1 =>   /usr/lib/64/libsocket.so.1
  libc.so.1 =>        /usr/lib/64/libc.so.1
  libpthread.so.1 =>  /usr/lib/64/libpthread.so.1
  libnsl.so.1 =>      /lib/64/libnsl.so.1
  libgcc_s.so.1 =>    /usr/sfw/lib/amd64/libgcc_s.so.1
  libaio.so.1 =>      /lib/64/libaio.so.1
  libmd.so.1 =>       /lib/64/libmd.so.1
  libmp.so.2 =>       /lib/64/libmp.so.2
  libscf.so.1 =>      /lib/64/libscf.so.1
  libdoor.so.1 =>     /lib/64/libdoor.so.1
  libuutil.so.1 =>    /lib/64/libuutil.so.1
  libgen.so.1 =>      /lib/64/libgen.so.1

# file /usr/local/postgres64/lib/libpq.so.5
/usr/local/postgres64/lib/libpq.so.5:   ELF 64-bit LSB dynamic lib 
AMD64 Version 1 [SSE CMOV], dynamically linked, not stripped



What am I missing???

Here's my environment.

Solaris 10 x86_64 with postgres 8.3.8 and OpenSSL 0.9.8k, using gcc
version 3.4.3 (csl-sol210-3_4-branch+sol_rpath), Sun Studio 12.1 and
GNU Make 3.80.

I've even monkeyed with LD_LIBRARY_PATH but I'm getting the same issues.
Seems when I don't compile in OpenSSL everything is fine.


Thanks!









Re: [HACKERS] postgres 8.3.8 and Solaris 10_x86 64 bit problems?

2009-10-18 Thread Andrew Chernow
I'm curious if this is a lost hope.  My boss is recommending we flatten
the Sun box and install Red Hat Linux (which I'm fine with).  I'd rather
not, as threading in Solaris is better.


Maybe Solaris threads were better 10-15 years ago, but I'm not convinced that is
still the case.  Any data supporting that argument, Solaris 10 threads vs. Linux
2.6.11+ kernel (p)threads?


Another thing to consider in your decision is that Sun was just bought by
Oracle, leaving the Solaris road map up in the air.  At least for me, Linux is a
bit more reassuring ... or FreeBSD.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/



[HACKERS] foreign-key inference join removal

2009-10-18 Thread Robert Haas
On Thu, Sep 17, 2009 at 5:14 PM, I wrote:
 I think that the next project in this area should
 be to try to support removal of INNER joins.  (Removal of SEMI, ANTI,
 and FULL joins seems unlikely ever to be interesting.)  That will
 require teaching the planner about foreign key constraints, which
 opens up some other interesting optimization
 opportunities.  An INNER join that can never match zero rows can
 alternatively be implemented as a LEFT join, which can be reordered in
 some situations where an inner join can't.  e.g. A LJ (B IJ C ON Pbc)
 ON Pab can be implemented as (A LJ B ON Pab) LJ/IJ C ON Pbc if it is
 the case that for every row in B, there will be at least one row in C
 such that Pbc holds.

A few more thoughts on this topic.

Suppose we define a new join type called inner_or_left_join.  This
means that we've proven that every outer row has at least one join
partner, so that we'll get the same results whichever way we implement
it.  We can prove this for either inner joins or left joins, or for
right joins with the sides reversed, by checking that:

(1) The inner rel is a baserel with no restriction clauses.
(2) All the join clauses are merge-joinable.
(3) There is a table on the outer side of the join with a foreign key
constraint referencing the inner table, such that the columns of the
foreign key constraint and the chosen equality operators exactly match
up with the join clauses (no extra columns, no extra join clauses).
(4) All the relevant columns of the outer table are NOT NULL.

I am not quite sure how to check for #3 and #4, nor am I sure where
the best place to put this logic is.
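
For a concrete example of a join meeting all four conditions, with a
hypothetical pair of tables:

CREATE TABLE b (id int PRIMARY KEY);
CREATE TABLE a (
    id   int PRIMARY KEY,
    b_id int NOT NULL REFERENCES b (id)
);

-- Every row of a has exactly one join partner in b, so this inner
-- join could be relabelled inner_or_left (and b removed outright if
-- no column of b is actually needed):
SELECT a.* FROM a JOIN b ON a.b_id = b.id;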

Once this infrastructure is in place, it opens up a number of
interesting optimization opportunities.

1. Inner joins that are relabelled as inner_or_left joins can be
removed using the same logic that we already use for left joins.

2. Inner_or_left joins can be reordered more flexibly than either
inner joins or left joins.  Consider:

A LJ (B IJ C ON Pbc) ON Pab

If B IJ C is inner_or_left and Pbc is strict, the joins commute.
Further, I believe that if the LJ is inner_or_left and Pab is strict,
we can commute the joins provided that, in the process, we change the
join between A and B to be an inner join (no longer inner_or_left).
But if neither join is inner_or_left then no reordering is possible.

3. Some path types are implemented only for inner joins but not for
left joins; these can also be used for inner_or_left joins.  For
example, consider A LJ B ON Pab.  If the join can be proven
inner_or_left, we have the option of treating it as an inner join and
using A rather than B as the inner rel.

Thoughts?

...Robert



Re: [HACKERS] Reworks for Access Control facilities (r2363)

2009-10-18 Thread KaiGai Kohei
Heikki Linnakangas wrote:
 KaiGai Kohei wrote:
 1) creation of a database object

 In the SELinux model, when a user tries to create a new object (not limited
 to database objects -- a file or socket, for example), a default security
 context is assigned to the new object, then SELinux checks whether the user
 has privileges to create a new object labeled with that security context.

 When we create a new table, the default PG model checks ACL_CREATE privilege
 on the namespace which is supposed to own the new table. DefineRelation()
 invokes pg_namespace_aclcheck() with the OID of the namespace, but we cannot
 see any properties of the new table from inside pg_namespace_aclcheck().
 It checks permissions on the pair of the user and the namespace.

 On the other hand, the SE-PG model follows the above principle. When we create
 a new table, SE-PG computes a default security context to be assigned,
 then it checks the security policy as to whether the user is allowed to create
 a new table labeled with that context.
 It checks permissions on the pair of the user and the new table itself.
 
 I don't think I buy that argument.  Can't we simply decide that in
 PostgreSQL, the granularity is different, and you can only create
 policies governing creation of objects on the basis of schema+user
 combination, not on the properties of the new object?  AFAICS it wouldn't
 violate the principle of Mandatory Access Control.

No, it violates the principle.
I omitted a case to simplify the explanation.
When we create a new object, we can provide an explicit security context
to be assigned to the new object, instead of the default one.
In this case, SELinux checks the privilege to create the object with the
given security context.  (If that is disallowed, the creation fails.)

If we check the MAC permission to create a new object based on the pair
of the user and the schema which owns the new one, we also allow users to
create a new object with an arbitrary security context, because the
check is not applied to the security context of the new object itself.

That is why SELinux is MAC: it never allows a new object to be created
with a security context that violates the policy.  The only way to enforce
this is to check privileges on the pair of the user and the new
object.  Thus, SELinux defines a create permission on various kinds of
objects, not limited to database objects: files, sockets, IPC,
X windows and so on.


 2) AND-condition for all the privileges

 When a certain action requires multiple permissions at one time,
 the principle of SELinux is that all the permissions have to be checked.
 If one of them is not allowed, it disallows the required action.
 In other word, all the conditions are chained by AND.

 This principle enables us to analyze the data flows between users and
 resources with the security policy, without implementation details.
 If a certain permission (e.g. db_table:{select}) can override any other
 permission (e.g. db_column:{select}), it also implicitly means a possibility
 of information leaks/manipulations, even if the security policy said this
 user cannot read data from the column.

 On the other hand, the default PG model allows checks on certain objects
 to be bypassed. For example, column-level privileges are only checked
 when a user does not have enough permissions on the target table.
 If SELECT a,b FROM t is given, pg_attribute_aclcheck() may not be invoked
 when the user has the needed privileges on the table t.
 
 Hmm, I see. Yes, it does seem like we'd need to change such permission
 checks to accommodate both models.

I'm not clear on why we need to rework the permission checks here.
DAC and MAC operate orthogonally and independently.
DAC allows table-level privileges to override column-level privileges,
according to the default PG model. That seems fine to me.
On the other hand, MAC checks both sets of permissions. That is also fine.
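
For example, with a hypothetical table t(a, b) and roles alice and bob:

GRANT SELECT ON t TO bob;           -- table-level privilege
GRANT SELECT (a, b) ON t TO alice;  -- column-level privileges only

-- For "SELECT a, b FROM t", bob passes the table-level check and the
-- column-level checks are never consulted; alice fails the table-level
-- check and is then checked column by column.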

 3) superuser is not an exception to access control.

 It is a similar issue to 2).
 
 Yeah.
 
 The following code is a part of AlterFunctionOwner_internal().

 
 /* Superusers can always do it */
 if (!superuser())
 {
     /* Otherwise, must be owner of the existing object */
     if (!pg_proc_ownercheck(procOid, GetUserId()))
         aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_PROC,
                        NameStr(procForm->proname));

     /* Must be able to become new owner */
     check_is_member_of_role(GetUserId(), newOwnerId);

     /* New owner must have CREATE privilege on namespace */
     aclresult = pg_namespace_aclcheck(procForm->pronamespace,
                                       newOwnerId,
                                       ACL_CREATE);
     if (aclresult != ACLCHECK_OK)
         aclcheck_error(aclresult, ACL_KIND_NAMESPACE,
                        get_namespace_name(procForm->pronamespace));
 }
 

 From the perspective of the default PG model, this code 

Re: [HACKERS] Reworks for Access Control facilities (r2363)

2009-10-18 Thread KaiGai Kohei
Robert Haas wrote:
 On Sat, Oct 17, 2009 at 9:53 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 This raises an important point: We need *developer documentation* on how
 to write SE-Pgsql compliant permission checks. Not only for authors of
 3rd party modules but for developers of PostgreSQL itself. Point 2)
 above needs to be emphasized, it's a big change in the way permission
 checks have to be programmed. One that I hadn't realized before. I
 haven't been paying much attention, but neither have most other
 developers, so we need clear documentation.
 
 This is a good point.  All throughout these discussions, there has
 been a concern that whatever is implemented here will be
 unmaintainable because we don't have any committers who are familiar
 with the ins and outs of SE-Linux and MAC (and not too many other
 community members interested in the topic, either).  So some developer
 documentation seems like it might help.
 
 On the other hand, KaiGai has made several attempts at documentation
 and several attempts at patches and we're not really any closer to
 having SE-PostgreSQL in core than we were a year ago.  I think that's
 partly because KaiGai tried to bite off far too much initially
 (still?), partly because of technical problems with the patches,
 partly because the intersection of people who are experts in
 PostgreSQL and people who are experts in MAC seems to be empty, and
 partly because, as much as people sorta kinda like this feature,
 nobody other than KaiGai has really been willing to step up and pour
 into this project the kind of resources that it will likely require to
 be successful.
 
 I have to admit that I'm kind of giving up hope.  We seem to be going
 in circles, and I don't think anything new is being said on this
 thread that hasn't been said before.

We may not always be able to find the right way to the mountain summit.
Indeed, it seems that we have returned to the original design, which deploys
SE-PgSQL's hooks at the strategic points.
But there is a significant improvement: we learned that several designs
we already tried were on the rocky path, although they looked like
an easy path at first.

I agree with Heikki's suggestion.
In addition to user documentation, we need documentation from the
viewpoint of a developer, which describes what permissions are defined,
what the purpose of SE-PgSQL's hooks is, and when/where these are called.

Thanks,

BTW, as I noted in my last message, I have to devote my time to the
Japan Linux Symposium this week, so responses may be delayed. Sorry.

http://events.linuxfoundation.org/events/japan-linux-symposium/schedule

-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] COPY enhancements

2009-10-18 Thread Gokulakannan Somasundaram
Actually I thought of a solution for the wrap-around some time back. Let me
try to put down my initial thoughts here. Maybe it will get refined over
the course of the conversation.

Transaction wrap-around failure

Actually this problem is present even in today's transaction id scenario, and
the only way we avoid it is by freezing. Can we use a similar approach?
Here, freezing would mean freezing the sub-transaction in order
to avoid the sub-transaction wrap-around failure. I think when we use a
sub-transaction, the tuple would have xmin set to the sub-transaction id
(correct me if I am wrong). If the tuple insertion is successful, we will make
it equal to the parent transaction id. This way, we get a chance to re-use
the previously used sub-transaction id. This would mean accessing the tuple
again after the sub-transaction completes.

On the recovery front, the freezing should get logged into redo. With this
approach, we need only one sub-transaction id for the entire COPY. If the
insert is successful for a copied row, we go to the tuple again and
sub-freeze it. If it is unsuccessful, we roll back the sub-transaction. But
for every unsuccessful transaction, we need a transaction id. Maybe in order
to avoid this, we can have one transaction id to mark the failures and
freeze all the failed rows under that transaction id.

I am just throwing out an idea for consideration.

Thanks,
Gokul.