Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
I summarized the series of discussion at:
  http://wiki.postgresql.org/wiki/RLS#Issue:_A_leaky_VIEWs_for_RLS

Please point out, if I missed or misunderstood something.

Thanks,

(2010/06/03 11:36), KaiGai Kohei wrote:
 (2010/06/02 12:02), KaiGai Kohei wrote:
 Here's another thought.  If we're leaning toward explicit syntax to
 designate security views (and I do mean IF, since only one person has
 signed on to that, even if it is Tom Lane!), then maybe we should
 think about ripping out the logic that causes regular views to be
 evaluated using the credentials of the view owner rather than the
 person selecting from it.  A security view would still use that logic,
 plus whatever additional stuff we come up with to prevent leakage.
 Perhaps this would be viewed as a nasty backward compatibility break,
 but the upside is that we'd then be being absolutely clear that a
 non-security view isn't and can never be trusted to be a security
 barrier.  Right now we're shipping something that purports to act as a
 barrier but really doesn't.


 Sorry, should we make clear the purpose of explicit syntax for security
 views being issued now?
 In my understanding, if security views, the planner tries to checks
 privileges of the person selecting it to reference underlaying tables
 without any ereport. If violated, the planner prevents user given
 quals (perhaps FuncExpr only?) to come into inside of the join scan.
 Otherwise, if regular views, the planner works as is. Right?

 
 I'd like to check up the problem in details.
 
 We can sort out a view into three types, as follows:
 (1) A view which cannot be pulled up
 (2) A view which can be pulled up, but does not contain any joins
 (3) A view which can be pulled up, and contains joins.
 
 For the type (1), we don't need to handle anything special, because
 no external quals are unavailable to come into.
 
 For the type (2), we also don't need to handle anything special
 except for the evaluation order matter in ExecQual(), because it is
 impossible to distribute external quals into a part of relations.
 
 So, the type (3) is all we have to consider here. Right?
 
 
 Then, where should we distinguish a security view and others?
 At least, it should be decided at pull_up_subqueries() or earlier,
 because it merges subqueries into the upper query.
 In subquery_planner(), the pull_up_subqueries() is called just after
 inline_set_returning_functions() which makes RTE_FUNCTION entry flatten
 if available. It seems to me not a strange logic to check privileges
 on underlaying relations in pull_up_subqueries(), because
 inline_set_returning_functions() also checks ACL_EXECUTE here on the
 functions to be flatten.
 
 
 Then, once we can identify what is a security view and not, it shall
 be marked on somewhere. Maybe, FromExpr of the pulled-up subquery?
 
 
 In my current idea, when a qual-node that contains FuncExpr tries to
 reference a part of relations within a security view, its referencing
 relations will be expanded to whole of the security view at
 distribute_qual_to_rels().
 Then, it will prevent a user defined function to come into inside of
 security views.
 
 
 At least, it seems to me reasonable to implement.
 Shall I launch to develop a proof of concept patch?
 
 Thanks,


-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-02 Thread KaiGai Kohei
(2010/06/02 12:02), KaiGai Kohei wrote:
 Here's another thought.  If we're leaning toward explicit syntax to
 designate security views (and I do mean IF, since only one person has
 signed on to that, even if it is Tom Lane!), then maybe we should
 think about ripping out the logic that causes regular views to be
 evaluated using the credentials of the view owner rather than the
 person selecting from it.  A security view would still use that logic,
 plus whatever additional stuff we come up with to prevent leakage.
 Perhaps this would be viewed as a nasty backward compatibility break,
 but the upside is that we'd then be being absolutely clear that a
 non-security view isn't and can never be trusted to be a security
 barrier.  Right now we're shipping something that purports to act as a
 barrier but really doesn't.

 
 Sorry, should we make clear the purpose of explicit syntax for security
 views being issued now?
 In my understanding, if security views, the planner tries to checks
 privileges of the person selecting it to reference underlaying tables
 without any ereport. If violated, the planner prevents user given
 quals (perhaps FuncExpr only?) to come into inside of the join scan.
 Otherwise, if regular views, the planner works as is. Right?
 

I'd like to check up the problem in details.

We can sort out a view into three types, as follows:
(1) A view which cannot be pulled up
(2) A view which can be pulled up, but does not contain any joins
(3) A view which can be pulled up, and contains joins.

For the type (1), we don't need to handle anything special, because
no external quals are unavailable to come into.

For the type (2), we also don't need to handle anything special
except for the evaluation order matter in ExecQual(), because it is
impossible to distribute external quals into a part of relations.

So, the type (3) is all we have to consider here. Right?


Then, where should we distinguish a security view and others?
At least, it should be decided at pull_up_subqueries() or earlier,
because it merges subqueries into the upper query.
In subquery_planner(), the pull_up_subqueries() is called just after
inline_set_returning_functions() which makes RTE_FUNCTION entry flatten
if available. It seems to me not a strange logic to check privileges
on underlaying relations in pull_up_subqueries(), because
inline_set_returning_functions() also checks ACL_EXECUTE here on the
functions to be flatten.


Then, once we can identify what is a security view and not, it shall
be marked on somewhere. Maybe, FromExpr of the pulled-up subquery?


In my current idea, when a qual-node that contains FuncExpr tries to
reference a part of relations within a security view, its referencing
relations will be expanded to whole of the security view at
distribute_qual_to_rels().
Then, it will prevent a user defined function to come into inside of
security views.


At least, it seems to me reasonable to implement.
Shall I launch to develop a proof of concept patch?

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread KaiGai Kohei
As it was reported before, we have an open item about leaky VIEWs for RLS.

On the talk at Ottawa, Robert suggested me to post my idea prior to submit
a patch. So, I'd like to explain my idea at first.
Actually I'm not familiar to optimizar details, so it needs any helps from
experts of optimizar.


The problem was ...

  * Using views for row-level access control is leaky
  http://archives.postgresql.org/pgsql-hackers/2009-10/msg01346.php

Even if a table is unvisible from certain users without views that filter
a part of tuples, it can leak to users as long as they can define their own
functions.

It seems to me the problem can be divided into two major parts.

See the following sample tables, views and functions.

  postgres=# CREATE TABLE t1 (a int primary key, b text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index t1_pkey for 
table t1
  CREATE TABLE
  postgres=# CREATE TABLE t2 (x int primary key, y text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index t2_pkey for 
table t2
  CREATE TABLE
  postgres=# INSERT INTO t1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
  INSERT 0 3
  postgres=# INSERT INTO t2 VALUES (1, 'xxx'), (2, 'yyy'), (3, 'zzz');
  INSERT 0 3

  -- We assume the security policy function needs the given integer key
  -- is odd number to be visible for users.
  --
  postgres=# CREATE OR REPLACE FUNCTION f_policy(int) RETURNS bool
 AS 'BEGIN RETURN $1 % 2 = 1; END' LANGUAGE plpgsql;
  CREATE FUNCTION

  -- We assume a malicious user defined function raises a notice with
  -- given arguments. It may be possible to insert it other temp tables.
  --
  postgres=# CREATE OR REPLACE FUNCTION f_malicious(text) RETURNS bool COST 
0.0001
 AS 'BEGIN RAISE NOTICE ''f_malicious: %'', $1; RETURN true; 
END;' LANGUAGE plpgsql;
  CREATE FUNCTION

[1] The order of scan filters to be evaluated
--
The first problem is an inversion of evaluation of scan filters.

  postgres=# CREATE OR REPLACE VIEW v1 AS SELECT * FROM t1 WHERE f_policy(a);
  CREATE VIEW

  postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(b);
QUERY PLAN
  ---
   Seq Scan on t1  (cost=0.00..329.80 rows=137 width=36)
 Filter: (f_malicious(b) AND f_policy(a))
  (2 rows)

  postgres=# SELECT * FROM v1 WHERE f_malicious(b);
  NOTICE:  f_malicious: aaa
  NOTICE:  f_malicious: bbb  -- leaky contents
  NOTICE:  f_malicious: ccc
   a |  b
  ---+-
   1 | aaa
   3 | ccc
  (2 rows)

In this case, owner of the view expects tuples within t1 shall be filtered
by f_policy() functions, so tuples with even-number shall be invisible.
However, the optimizar reorders evaluation of scan filters based on the cost
parameter of functions and others, then f_malicious() was invoked prior to
f_policy(). It is a right approach, if functions are not malicious.
But user may define a malicious purpose function.

The given query is internally rewritten, then subquery will be pulled up
in the optimizar logic.

  SELECT * FROM v1 WHERE f_malicious(b);
  - SELECT * FROM (SELECT * FROM t1 WHERE f_policy(a)) v1 WHERE f_malicious(b)
  - SELECT * FROM t1 WHERE f_policy(a) AND f_malicious(b)

During we create a scan plan, the order_qual_clauses() computes the best
order to evaluate the given WHERE clause based on the cost estimation.
In this case, f_malicious() has very small cost, so order_qual_clauses()
decides the f_malicious() should be invoked earlier than f_policy().
In the result, ExecScan() invokes f_malicious() with contents of scanned
tuples to be invisible.

I have an idea that we add FuncExpr a new field (e.g nestlevel) to remember
where is originally put in the query, and prevent reordering over the nest
level of subqueries.
In above example, f_malicious() has nestlevel=0 because it is put on the top
level.
But f_policy() has nestlevel=1 because it is originally put on the second
level subquery. Then, the order_qual_clauses() will check nestlevel of the
scan filter prior to reorder them based on the cost estimation.
Even if we have multiple nestlevels, solution will be same. A FuncExpr with
larger nestlevel shall be invoked earlier than others.

Please note that we only focus on user defined functions.
For example, it is worth to choose index-scans instead of seq-scans, when
a user provides conditions which can be indexed, as follows:

  SELECT * FROM v1 WHERE a = 100;
  - SELECT * FROM (SELECT * FROM t1 WHERE f_policy(a)) v1 WHERE a = 100;

In this case, we should scan the t1 using index with the condition of 'a = 100'
prior to evaluation of f_policy(). Any operators eventually invokes a function
being correctly installed, but an assumption is that we can trust operators,
index access method, type input/output methods, conversions and so on, because
these features have to be installed by DBA (or initdb).


[2] Unexpected distribution of scan filter

Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Heikki Linnakangas
On 01/06/10 11:39, KaiGai Kohei wrote:
 Any operators eventually invokes a function
 being correctly installed, but an assumption is that we can trust operators,
 index access method, type input/output methods, conversions and so on, because
 these features have to be installed by DBA (or initdb).

Operators can be created by regular users.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread KaiGai Kohei
(2010/06/01 18:08), Heikki Linnakangas wrote:
 On 01/06/10 11:39, KaiGai Kohei wrote:
 Any operators eventually invokes a function
 being correctly installed, but an assumption is that we can trust operators,
 index access method, type input/output methods, conversions and so on, 
 because
 these features have to be installed by DBA (or initdb).
 
 Operators can be created by regular users.
 
Oops, I missed it. Indeed, operator function is not limited to C-language
functions, so regular users can create it.

Apart from the topic, does it seem to you reasonable direction to tackle to
the leaky VIEWs problem?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Heikki Linnakangas
On 01/06/10 13:04, KaiGai Kohei wrote:
 Oops, I missed it. Indeed, operator function is not limited to C-language
 functions, so regular users can create it.
 
 Apart from the topic, does it seem to you reasonable direction to tackle to
 the leaky VIEWs problem?

Yeah, I guess it is.

The general problem is that it seems like a nightmare to maintain this
throughout the planner. Who knows what optimizations this affects, and
do we need to hide things like row-counts in EXPLAIN output? If we try
to be very strict, we can expect a stream of CVEs and security releases
in the future while we find holes and plug them. On the other hand,
using views to restrict access to underlying tables is a very useful
feature, so I'd hate to just give up. We need to decide what level of
isolation we try to accomplish.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Florian Pflug
On Jun 1, 2010, at 10:39 , KaiGai Kohei wrote:
 I have an idea that we add FuncExpr a new field (e.g nestlevel) to remember
 where is originally put in the query, and prevent reordering over the nest
 level of subqueries.
 In above example, f_malicious() has nestlevel=0 because it is put on the top
 level.
 But f_policy() has nestlevel=1 because it is originally put on the second
 level subquery. Then, the order_qual_clauses() will check nestlevel of the
 scan filter prior to reorder them based on the cost estimation.
 Even if we have multiple nestlevels, solution will be same. A FuncExpr with
 larger nestlevel shall be invoked earlier than others.

Wouldn't the information leak go away if you stuck OFFSET 0 at the end of the 
view? IIRC, that is the semi-offical way to create barriers for subquery 
flattening and such.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Robert Haas
2010/6/1 KaiGai Kohei kai...@ak.jp.nec.com:
 I have an idea that we add FuncExpr a new field (e.g nestlevel) to remember
 where is originally put in the query, and prevent reordering over the nest
 level of subqueries.
 In above example, f_malicious() has nestlevel=0 because it is put on the top
 level.
 But f_policy() has nestlevel=1 because it is originally put on the second
 level subquery. Then, the order_qual_clauses() will check nestlevel of the
 scan filter prior to reorder them based on the cost estimation.
 Even if we have multiple nestlevels, solution will be same. A FuncExpr with
 larger nestlevel shall be invoked earlier than others.
[...]
 My idea is similar to what I proposed at [1]. It adds a new field into
 RelOptInfo (or other structure?) to remember the original nestlevel of
 the scan, then it will be compared to nestlevel of the FuncExpr.
 If nestlevel of the FuncExpr is smaller than nestlevel of the RelOptInfo,
 it prevents to distribute the FuncExpr onto the RelOptInfo, even if the
 function depends on only the relation of RelOptInfo.

Keep in mind that users who are NOT using a view as a security barrier
don't expect it to kill performance.  This approach, and particularly
the second part, about preventing quals from being pushed through
joins, has the potential to be a performance disaster.  So I think
it's absolutely critical that we don't do that except when it's
necessary to prevent a security issue.

On the technical side, I am pretty doubtful that the approach of
adding a nestlevel to FuncExpr and RelOptInfo is the right way to go.
I believe we have existing code (to handle left joins) that prevents
quals from being pushed down too far by fudging the set of relations
that are supposedly needed to evaluate the qual.  I suspect a similar
approach would work here.

I think the steps here are:

1. Decide whether the view is a security barrier (perhaps, check
whether the user has sufficient privs to execute the underlying query;
or we could add an explicit setting).  If not, stop.
2. Decide whether each qual executes potentially untrusted code (algorithm?).
3. Prevent any untrusted quals from being pushed down into view that
is a security barrier.

We should have a design for each of these before we start coding.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Greg Stark
2010/6/1 Heikki Linnakangas heikki.linnakan...@enterprisedb.com:
 On 01/06/10 11:39, KaiGai Kohei wrote:
 Any operators eventually invokes a function
 being correctly installed, but an assumption is that we can trust operators,
 index access method, type input/output methods, conversions and so on, 
 because
 these features have to be installed by DBA (or initdb).

 Operators can be created by regular users.

So I think we don't actually have to worry about operators and
functions which allow us to use an index scan. If they're used in an
index definition then the definer of those functions can see the
entire table anyways.

The only place where this matters, at least to a first degree, is on
the filter operations applied to a scan. If the view isn't owned by
the current user then all the filters of the view have to be enforced
first then the query filters.

Heikki's point is still valid though. Consider if it's not a matter of
filter ordering but rather that a filter is being pushed down inside a
join. If the join is from the view then it would be unsafe to filter
the rows before seeing which rows match the join... unless we can
prove all the rows survive... It would really suck not to do this
optimization too if for example you have a filter which filters all
but a single row and then join against a large table...


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Robert Haas
2010/6/1 Heikki Linnakangas heikki.linnakan...@enterprisedb.com:
 The general problem is that it seems like a nightmare to maintain this
 throughout the planner. Who knows what optimizations this affects, and
 do we need to hide things like row-counts in EXPLAIN output? If we try
 to be very strict, we can expect a stream of CVEs and security releases
 in the future while we find holes and plug them. On the other hand,
 using views to restrict access to underlying tables is a very useful
 feature, so I'd hate to just give up. We need to decide what level of
 isolation we try to accomplish.

I'm entirely uninspired by the idea of trying to prevent all possible
leakage of information via side-channels.  Even if we tried to
obfuscate the explain output, the user can still potentially get some
information by timing how long queries take to execute.  I think if we
have a hook function that can prevent certain users from running
EXPLAIN altogether (and I believe this may already be the case) that's
about the appropriate level of worrying about that case.  I think the
only thing that we can realistically prevent is allowing users to make
off with the actual tuples.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 Heikki's point is still valid though. Consider if it's not a matter of
 filter ordering but rather that a filter is being pushed down inside a
 join. If the join is from the view then it would be unsafe to filter
 the rows before seeing which rows match the join... unless we can
 prove all the rows survive... It would really suck not to do this
 optimization too if for example you have a filter which filters all
 but a single row and then join against a large table...

Well, more generally, any restriction whatsoever that is placed on
the current planner behavior in the name of security will result in
catastrophic performance degradation for some queries.  I agree with
Robert's nearby comments that we need to be selective about which
views we do this to and which functions we distrust.

CREATE SECURITY VIEW, anyone?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Greg Stark
Also incidentally I'm having trouble imagining a scenario where this
really matters. For it to be an issue you would have to simultaneously
have a user which can't access all the data and must go through views
which limit the data he can access -- and has privileges to issue DDL
to create functions and operators. That seems like an unlikely
combination. I've seen views used before to restrict the role accounts
used by front-end applications but those accounts have no DDL
privileges.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Stephen Frost
* Greg Stark (gsst...@mit.edu) wrote:
 Also incidentally I'm having trouble imagining a scenario where this
 really matters. For it to be an issue you would have to simultaneously
 have a user which can't access all the data and must go through views
 which limit the data he can access -- and has privileges to issue DDL
 to create functions and operators. That seems like an unlikely
 combination. I've seen views used before to restrict the role accounts
 used by front-end applications but those accounts have no DDL
 privileges.

Erm, I have to disagree with this in general..  We don't all just build
web apps.  On multi-user databases, this really isn't that uncommon.
I'm not saying it's an everyday kind of thing, but I don't think this
issue is something we can just ignore either.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Robert Haas
On Tue, Jun 1, 2010 at 10:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 CREATE SECURITY VIEW, anyone?

That may be the best approach, but I think it needs more than one line
of exposition.  The approach I proposed was to test whether the user
has privileges to execute the underlying query directly without going
through the view.  If so, we needn't be concerned.  If not, then we
start thinking about which functions/operators we trust.

The only disadvantage to that approach that I see is that it
introduces some extra permission-checking overhead.  If having an
explicit notion of which views are intended to be security views
enables us to reduce or eliminate that overhead, it may be worth
doing.  But I'm not sure that it does.  A blanket rule that says
don't push untrusted quals into security views is not going to be
too satisfactory - we probably want to do that only when the view is
queried by an untrusted user - the superuser, or someone else with
adequate permissions, will presumably still want his quals to get
pushed down.

Perhaps there is some value to having a knob that goes the opposite
directions and essentially says I don't really care whether this view
is leaky from a security perspective.  But presumably we don't want
to deliver that behavior by default and require the user to ask for a
SECURITY VIEW to get something else - if anything, we'd want CREATE
VIEW to create the normal (secure) version and add CREATE LEAKY VIEW
to do the other thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 1, 2010 at 10:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 CREATE SECURITY VIEW, anyone?

 That may be the best approach, but I think it needs more than one line
 of exposition.  The approach I proposed was to test whether the user
 has privileges to execute the underlying query directly without going
 through the view.  If so, we needn't be concerned.  If not, then we
 start thinking about which functions/operators we trust.

Ummm ... that makes semantics dependent on the permissions available at
plan time, whereas what should matter is the permissions that exist at
execution time.  Maybe that's all right for this context but it doesn't
seem tremendously desirable.

 Perhaps there is some value to having a knob that goes the opposite
 directions and essentially says I don't really care whether this view
 is leaky from a security perspective.  But presumably we don't want
 to deliver that behavior by default and require the user to ask for a
 SECURITY VIEW to get something else - if anything, we'd want CREATE
 VIEW to create the normal (secure) version and add CREATE LEAKY VIEW
 to do the other thing.

-1 on that.  We will get far more pushback from people whose application
performance suddenly went to hell than we will ever get approval from
people who actually need the feature.  Considering that we've survived
this long with leaky views, that should definitely remain the default
behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Robert Haas
On Tue, Jun 1, 2010 at 1:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 1, 2010 at 10:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 CREATE SECURITY VIEW, anyone?

 That may be the best approach, but I think it needs more than one line
 of exposition.  The approach I proposed was to test whether the user
 has privileges to execute the underlying query directly without going
 through the view.  If so, we needn't be concerned.  If not, then we
 start thinking about which functions/operators we trust.

 Ummm ... that makes semantics dependent on the permissions available at
 plan time, whereas what should matter is the permissions that exist at
 execution time.  Maybe that's all right for this context but it doesn't
 seem tremendously desirable.

Ugh.  I hope there's a way around that problem because AFAICS the
alternative is a world of hurt.  If we're not allowed to take the
security context into account during planning, then we're going to
have to make worst-case assumptions, which sounds really unpleasant.

 Perhaps there is some value to having a knob that goes the opposite
 directions and essentially says I don't really care whether this view
 is leaky from a security perspective.  But presumably we don't want
 to deliver that behavior by default and require the user to ask for a
 SECURITY VIEW to get something else - if anything, we'd want CREATE
 VIEW to create the normal (secure) version and add CREATE LEAKY VIEW
 to do the other thing.

 -1 on that.  We will get far more pushback from people whose application
 performance suddenly went to hell than we will ever get approval from
 people who actually need the feature.  Considering that we've survived
 this long with leaky views, that should definitely remain the default
 behavior.

Eh, if that's the consensus, it doesn't bother me that much, but it
doesn't really answer the question, either: supposing we add an
explicit concept of a security view, what should its semantics be?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Merlin Moncure
On Tue, Jun 1, 2010 at 1:28 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 1, 2010 at 1:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 1, 2010 at 10:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 CREATE SECURITY VIEW, anyone?

 That may be the best approach, but I think it needs more than one line
 of exposition.  The approach I proposed was to test whether the user
 has privileges to execute the underlying query directly without going
 through the view.  If so, we needn't be concerned.  If not, then we
 start thinking about which functions/operators we trust.

 Ummm ... that makes semantics dependent on the permissions available at
 plan time, whereas what should matter is the permissions that exist at
 execution time.  Maybe that's all right for this context but it doesn't
 seem tremendously desirable.

 Ugh.  I hope there's a way around that problem because AFAICS the
 alternative is a world of hurt.  If we're not allowed to take the
 security context into account during planning, then we're going to
 have to make worst-case assumptions, which sounds really unpleasant.

 Perhaps there is some value to having a knob that goes the opposite
 directions and essentially says I don't really care whether this view
 is leaky from a security perspective.  But presumably we don't want
 to deliver that behavior by default and require the user to ask for a
 SECURITY VIEW to get something else - if anything, we'd want CREATE
 VIEW to create the normal (secure) version and add CREATE LEAKY VIEW
 to do the other thing.

 -1 on that.  We will get far more pushback from people whose application
 performance suddenly went to hell than we will ever get approval from
 people who actually need the feature.  Considering that we've survived
 this long with leaky views, that should definitely remain the default
 behavior.

 Eh, if that's the consensus, it doesn't bother me that much, but it
 doesn't really answer the question, either: supposing we add an
 explicit concept of a security view, what should its semantics be?

have you ruled out: 'create function'? :-)

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Robert Haas
On Tue, Jun 1, 2010 at 4:10 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jun 1, 2010 at 1:28 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 1, 2010 at 1:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 1, 2010 at 10:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 CREATE SECURITY VIEW, anyone?

 That may be the best approach, but I think it needs more than one line
 of exposition.  The approach I proposed was to test whether the user
 has privileges to execute the underlying query directly without going
 through the view.  If so, we needn't be concerned.  If not, then we
 start thinking about which functions/operators we trust.

 Ummm ... that makes semantics dependent on the permissions available at
 plan time, whereas what should matter is the permissions that exist at
 execution time.  Maybe that's all right for this context but it doesn't
 seem tremendously desirable.

 Ugh.  I hope there's a way around that problem because AFAICS the
 alternative is a world of hurt.  If we're not allowed to take the
 security context into account during planning, then we're going to
 have to make worst-case assumptions, which sounds really unpleasant.

 Perhaps there is some value to having a knob that goes the opposite
 directions and essentially says I don't really care whether this view
 is leaky from a security perspective.  But presumably we don't want
 to deliver that behavior by default and require the user to ask for a
 SECURITY VIEW to get something else - if anything, we'd want CREATE
 VIEW to create the normal (secure) version and add CREATE LEAKY VIEW
 to do the other thing.

 -1 on that.  We will get far more pushback from people whose application
 performance suddenly went to hell than we will ever get approval from
 people who actually need the feature.  Considering that we've survived
 this long with leaky views, that should definitely remain the default
 behavior.

 Eh, if that's the consensus, it doesn't bother me that much, but it
 doesn't really answer the question, either: supposing we add an
 explicit concept of a security view, what should its semantics be?

 have you ruled out: 'create function'? :-)

You lost me...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Merlin Moncure
On Tue, Jun 1, 2010 at 4:57 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 1, 2010 at 4:10 PM, Merlin Moncure mmonc...@gmail.com wrote:
 have you ruled out: 'create function'? :-)

 You lost me...

Well, as noted by the OP, using views for security in postgres is
simply wishful thinking.  This is part of a family of issues
(generally not evil nor fixable) under the category of 'there is no
real control over when functions in a query fire'.

My point was that in cases where users expect this behavior, why not
encourage them to use functions instead of views?  Is there any formal
expectation that views can be used to hide data in this way?  Does
this really have to be fixed, and if so should it be in light of the
fact that our rule system is basically understood to be broken?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread KaiGai Kohei
(2010/06/01 22:16), Robert Haas wrote:
 2010/6/1 Heikki Linnakangasheikki.linnakan...@enterprisedb.com:
 The general problem is that it seems like a nightmare to maintain this
 throughout the planner. Who knows what optimizations this affects, and
 do we need to hide things like row-counts in EXPLAIN output? If we try
 to be very strict, we can expect a stream of CVEs and security releases
 in the future while we find holes and plug them. On the other hand,
 using views to restrict access to underlying tables is a very useful
 feature, so I'd hate to just give up. We need to decide what level of
 isolation we try to accomplish.
 
 I'm entirely uninspired by the idea of trying to prevent all possible
 leakage of information via side-channels.  Even if we tried to
 obfuscate the explain output, the user can still potentially get some
 information by timing how long queries take to execute.  I think if we
 have a hook function that can prevent certain users from running
 EXPLAIN altogether (and I believe this may already be the case) that's
 about the appropriate level of worrying about that case.  I think the
 only thing that we can realistically prevent is allowing users to make
 off with the actual tuples.
 
It is a good point, I think.

Even if we can guess or estimate something from circumstances, it does not
allow unprivileged users to read information directly.
In fact, we cannot find such a functionality to prevent side-channel leaks
from the certification reports of commercial RDBMS with RLS (e.g; Oracle
Label Security).

However, the leaky VIEWs has a different characteristic.
It unintentionally allows to fetch contents of invisible tuples and move
into into other tables. It means here is a data flow channel (not side channel),
but it breaks restriction of security views.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread KaiGai Kohei
(2010/06/01 22:09), Robert Haas wrote:
 2010/6/1 KaiGai Koheikai...@ak.jp.nec.com:
 I have an idea that we add FuncExpr a new field (e.g nestlevel) to remember
 where is originally put in the query, and prevent reordering over the nest
 level of subqueries.
 In above example, f_malicious() has nestlevel=0 because it is put on the top
 level.
 But f_policy() has nestlevel=1 because it is originally put on the second
 level subquery. Then, the order_qual_clauses() will check nestlevel of the
 scan filter prior to reorder them based on the cost estimation.
 Even if we have multiple nestlevels, solution will be same. A FuncExpr with
 larger nestlevel shall be invoked earlier than others.
 [...]
 My idea is similar to what I proposed at [1]. It adds a new field into
 RelOptInfo (or other structure?) to remember the original nestlevel of
 the scan, then it will be compared to nestlevel of the FuncExpr.
 If nestlevel of the FuncExpr is smaller than nestlevel of the RelOptInfo,
 it prevents to distribute the FuncExpr onto the RelOptInfo, even if the
 function depends on only the relation of RelOptInfo.
 
 Keep in mind that users who are NOT using a view as a security barrier
 don't expect it to kill performance.  This approach, and particularly
 the second part, about preventing quals from being pushed through
 joins, has the potential to be a performance disaster.  So I think
 it's absolutely critical that we don't do that except when it's
 necessary to prevent a security issue.
 
Yes, I agree. It is necessary to distinguish security conscious views
and others. In general, only creator of the view knows its intention
correctly, so I think it is reasonable suggestion to provide a hint
whether we should prevent a part of optimization, or not.

 On the technical side, I am pretty doubtful that the approach of
 adding a nestlevel to FuncExpr and RelOptInfo is the right way to go.
 I believe we have existing code (to handle left joins) that prevents
 quals from being pushed down too far by fudging the set of relations
 that are supposedly needed to evaluate the qual.  I suspect a similar
 approach would work here.
 
 I think the steps here are:
 
 1. Decide whether the view is a security barrier (perhaps, check
 whether the user has sufficient privs to execute the underlying query;
 or we could add an explicit setting).  If not, stop.

I'm a fun of an explicit setting.
Queries are optimized prior to execution stage.

 2. Decide whether each qual executes potentially untrusted code (algorithm?).

A simple idea is to assume all the FuncExpr being potentially untrusted
as a starting up of the fix.
(Can we trust all the built-in functions? It needs to ensure they don't
have any side-effects; in future versions also.)

 3. Prevent any untrusted quals from being pushed down into view that
 is a security barrier.
 
 We should have a design for each of these before we start coding.
 

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread KaiGai Kohei
(2010/06/02 2:28), Robert Haas wrote:
 On Tue, Jun 1, 2010 at 1:02 PM, Tom Lanet...@sss.pgh.pa.us  wrote:
 Robert Haasrobertmh...@gmail.com  writes:
 On Tue, Jun 1, 2010 at 10:57 AM, Tom Lanet...@sss.pgh.pa.us  wrote:
 CREATE SECURITY VIEW, anyone?

 That may be the best approach, but I think it needs more than one line
 of exposition.  The approach I proposed was to test whether the user
 has privileges to execute the underlying query directly without going
 through the view.  If so, we needn't be concerned.  If not, then we
 start thinking about which functions/operators we trust.

 Ummm ... that makes semantics dependent on the permissions available at
 plan time, whereas what should matter is the permissions that exist at
 execution time.  Maybe that's all right for this context but it doesn't
 seem tremendously desirable.
 
 Ugh.  I hope there's a way around that problem because AFAICS the
 alternative is a world of hurt.  If we're not allowed to take the
 security context into account during planning, then we're going to
 have to make worst-case assumptions, which sounds really unpleasant.
 
I was reminded that inline_set_returning_function() tries to extract
a given RangeTblEntry with RTE_FUNCTION into a subquery when a few
conditions are satisfied. The conditions include whether user has
privileges to execute the function.

It seems to me planner checks permissions, and going to have worst
case assumptions, if access privilege violations.

As long as we can resolve the problem that I described at [1] (order
of evaluation of scan filters), it seems to me a reasonable fallback.
(Although I mentioned that queries are optimized prior to execution stage...)

 Perhaps there is some value to having a knob that goes the opposite
 directions and essentially says I don't really care whether this view
 is leaky from a security perspective.  But presumably we don't want
 to deliver that behavior by default and require the user to ask for a
 SECURITY VIEW to get something else - if anything, we'd want CREATE
 VIEW to create the normal (secure) version and add CREATE LEAKY VIEW
 to do the other thing.

 -1 on that.  We will get far more pushback from people whose application
 performance suddenly went to hell than we will ever get approval from
 people who actually need the feature.  Considering that we've survived
 this long with leaky views, that should definitely remain the default
 behavior.
 
 Eh, if that's the consensus, it doesn't bother me that much, but it
 doesn't really answer the question, either: supposing we add an
 explicit concept of a security view, what should its semantics be?
 

How about a GUC option to provide the default, like default_with_oids?

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread Robert Haas
2010/6/1 KaiGai Kohei kai...@ak.jp.nec.com:
 Eh, if that's the consensus, it doesn't bother me that much, but it
 doesn't really answer the question, either: supposing we add an
 explicit concept of a security view, what should its semantics be?

 How about a GUC option to provide the default, like default_with_oids?

Bad idea.  We already have enough problems with GUCs that can create
security problems if they're unexpectedly set to the wrong value.  We
don't need any more.  Anyhow, that's trivia.  The real thing we need
to decide here is to design the security mechanism.  We can change the
syntax to whatever we want very easily.

Here's another thought.  If we're leaning toward explicit syntax to
designate security views (and I do mean IF, since only one person has
signed on to that, even if it is Tom Lane!), then maybe we should
think about ripping out the logic that causes regular views to be
evaluated using the credentials of the view owner rather than the
person selecting from it.  A security view would still use that logic,
plus whatever additional stuff we come up with to prevent leakage.
Perhaps this would be viewed as a nasty backward compatibility break,
but the upside is that we'd then be being absolutely clear that a
non-security view isn't and can never be trusted to be a security
barrier.  Right now we're shipping something that purports to act as a
barrier but really doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-01 Thread KaiGai Kohei
(2010/06/02 10:52), Robert Haas wrote:
 2010/6/1 KaiGai Koheikai...@ak.jp.nec.com:
 Eh, if that's the consensus, it doesn't bother me that much, but it
 doesn't really answer the question, either: supposing we add an
 explicit concept of a security view, what should its semantics be?

 How about a GUC option to provide the default, like default_with_oids?
 
 Bad idea.  We already have enough problems with GUCs that can create
 security problems if they're unexpectedly set to the wrong value.  We
 don't need any more.  Anyhow, that's trivia.  The real thing we need
 to decide here is to design the security mechanism.  We can change the
 syntax to whatever we want very easily.
 
Indeed, syntax will be decided according to the logic.

 Here's another thought.  If we're leaning toward explicit syntax to
 designate security views (and I do mean IF, since only one person has
 signed on to that, even if it is Tom Lane!), then maybe we should
 think about ripping out the logic that causes regular views to be
 evaluated using the credentials of the view owner rather than the
 person selecting from it.  A security view would still use that logic,
 plus whatever additional stuff we come up with to prevent leakage.
 Perhaps this would be viewed as a nasty backward compatibility break,
 but the upside is that we'd then be being absolutely clear that a
 non-security view isn't and can never be trusted to be a security
 barrier.  Right now we're shipping something that purports to act as a
 barrier but really doesn't.
 

Sorry, should we make clear the purpose of explicit syntax for security
views being issued now?
In my understanding, if security views, the planner tries to checks
privileges of the person selecting it to reference underlaying tables
without any ereport. If violated, the planner prevents user given
quals (perhaps FuncExpr only?) to come into inside of the join scan.
Otherwise, if regular views, the planner works as is. Right?

I don't think we need whatever additional user visible stuff to prevent
leakage except for fully optimized query plan. (Of course, it can make
performance regression.)
It seems to me the issue is just an order to execute user defined
functions and qualifier of security views to restrict visible tuples,
so I don't know whether it breaks any backward compatibility.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers