Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-08 Thread KaiGai Kohei
(2010/06/08 11:15), Robert Haas wrote:
> 2010/6/7 KaiGai Kohei:
>> Our headache is on functions categorized to middle-threat. It enables to
>> leak the given arguments using error messages. Here are several ideas,
>> but they have good and bad points.
> 
> I think we are altogether off in the weeds here.  We ought to start
> with an implementation that pushes nothing down, and then try to
> figure out how much we can relax that without too much compromising
> security.
> 

The attached patch tries to prevent pushing down anything into subqueries
from outside of them.

The distribute_qual_to_rels() tries to distribute the given qualifier
into a certain scanning-plan based on the dependency of qualifier.

E.g) SELECT * FROM (SELECT * FROM t1 JOIN t2 ON t1.a = t2.x WHERE 
f_policy(t1.a)) WHERE f_user(t2.x);

In this case, f_user() function depends on only t2 table, so it is
reasonable to attach on the scanning plan of t2 from perspective of
performance.

However, f_user() may have a side-effect which writes arguments into
somewhere. If here is such a possibility, f_user() should not be called
before the joined tuples being filtered by f_policy().

In the case when we can ensure all functions within the qualifier are
enough trustable, we don't need to prevent them to be pushed down.
But the algorithm to determine it is under discussion. So, right now,
we prevent all the possible pushing down.

Example.1)  CREATE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON a = x WHERE 
f_policy(a);
SELECT * FROM v1 WHERE f_malicious(b);

 * without this patch
postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(b);
QUERY PLAN
---
 Hash Join  (cost=639.01..667.29 rows=137 width=72)
   Hash Cond: (t2.x = t1.a)
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=637.30..637.30 rows=137 width=36)
 ->  Seq Scan on t1  (cost=0.00..637.30 rows=137 width=36)
   Filter: (f_policy(a) AND f_malicious(b))
(6 rows)

 * with this patch
postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(b);
QUERY PLAN
---
 Hash Join  (cost=334.93..468.44 rows=137 width=72)
   Hash Cond: (t2.x = t1.a)
   Join Filter: f_malicious(t1.b)
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=329.80..329.80 rows=410 width=36)
 ->  Seq Scan on t1  (cost=0.00..329.80 rows=410 width=36)
   Filter: f_policy(a)
(7 rows)

It prevents to push down f_malicious() inside of the join loop.


Example.2)  CREATE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON a = x WHERE 
f_policy(a);
SELECT * FROM v1 JOIN t3 ON v1.a=t3.s WHERE f_malicious(b);

  * without this patch
postgres=# EXPLAIN SELECT * FROM v1 JOIN t3 ON v1.a=t3.s WHERE 
f_malicious(b);
  QUERY PLAN

---
 Hash Join  (cost=669.01..697.29 rows=137 width=108)
   Hash Cond: (t3.s = t1.a)
   ->  Seq Scan on t3  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=667.29..667.29 rows=137 width=72)
 ->  Hash Join  (cost=639.01..667.29 rows=137 width=72)
   Hash Cond: (t2.x = t1.a)
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=637.30..637.30 rows=137 width=36)
 ->  Seq Scan on t1  (cost=0.00..637.30 rows=137 
width=36)
   Filter: (f_policy(a) AND f_malicious(b))
(10 rows)

  * with this patch
postgres=# EXPLAIN SELECT * FROM v1 JOIN t3 ON v1.a=t3.s WHERE 
f_malicious(b);
  QUERY PLAN

---
 Hash Join  (cost=470.15..498.43 rows=137 width=108)
   Hash Cond: (t3.s = t1.a)
   ->  Seq Scan on t3  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=468.44..468.44 rows=137 width=72)
 ->  Hash Join  (cost=334.93..468.44 rows=137 width=72)
   Hash Cond: (t2.x = t1.a)
   Join Filter: f_malicious(t1.b)
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=329.80..329.80 rows=410 width=36)
 ->  Seq Scan on t1  (cost=0.00..329.80 rows=410 
width=36)
   Filter: f_policy(a)
(11 rows)

It also prevents f_malisious() to be pushed down into the join loop within view,
but we can push it down into same level of the query.


Please note that it specially handles equality operator at the bottom half of
the distribute_qual_to_rels(), so this patch does not care about these cases.
However, I'm not in hust

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 11:28), Stephen Frost wrote:
> For the sake of clarity..
> 
> * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
>> OK, it was too implementation-specific.
> 
> No, that wasn't the problem.  There isn't an actual implementation yet
> for it to be too-specific on.  The problem is that proposing a change to
> the catalog without figuring out what it'd actually be used for in an
> overall solution is a waste of time.
> 
Indeed,

>> Please return to the categorization with 3-level that I mentioned at
>> the previous message.
> 
> As Robert said, we're off in the weeds here.  I'm not convinced that
> we've got 3 levels, for starters.  It could well be fewer, or more.
> Let's stop making assumptions about what's OK and what's not OK.
> 
Indeed, we may find out the 4th category in the future.

>> For built-in functions, the code should be reviewed to ensure it does not
>> expose the given argument using error messages.
>> Then, we can mark it as trusted.
> 
> One thing that I think *is* clear- removing useful information from
> error messages is *not* going to be an acceptable "solution".
> 
Even if it is conditional, like as Greg Stark suggested?

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
For the sake of clarity..

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
> OK, it was too implementation-specific.

No, that wasn't the problem.  There isn't an actual implementation yet
for it to be too-specific on.  The problem is that proposing a change to
the catalog without figuring out what it'd actually be used for in an
overall solution is a waste of time. 

> Please return to the categorization with 3-level that I mentioned at
> the previous message.

As Robert said, we're off in the weeds here.  I'm not convinced that
we've got 3 levels, for starters.  It could well be fewer, or more.
Let's stop making assumptions about what's OK and what's not OK.

> For built-in functions, the code should be reviewed to ensure it does not
> expose the given argument using error messages.
> Then, we can mark it as trusted.

One thing that I think *is* clear- removing useful information from
error messages is *not* going to be an acceptable "solution".

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 11:15), Robert Haas wrote:
> 2010/6/7 KaiGai Kohei:
>> Our headache is on functions categorized to middle-threat. It enables to
>> leak the given arguments using error messages. Here are several ideas,
>> but they have good and bad points.
> 
> I think we are altogether off in the weeds here.  We ought to start
> with an implementation that pushes nothing down, and then try to
> figure out how much we can relax that without too much compromising
> security.
> 
It seems to me fair enough.
I think we can adjust what functions are harmless, and whats are not later.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> 2010/6/7 KaiGai Kohei :
> > Our headache is on functions categorized to middle-threat. It enables to
> > leak the given arguments using error messages. Here are several ideas,
> > but they have good and bad points.
> 
> I think we are altogether off in the weeds here.  We ought to start
> with an implementation that pushes nothing down, and then try to
> figure out how much we can relax that without too much compromising
> security.

I agree with this- and it's more-or-less what I was trying to propose in
my previous comments.  I'm not even sure we need to focus on not pushing
anything down at this point- I'd start with trying to get enough
information passed around/through the system to even *identify* the case
where there's a problem in the first place..

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Robert Haas
2010/6/7 KaiGai Kohei :
> Our headache is on functions categorized to middle-threat. It enables to
> leak the given arguments using error messages. Here are several ideas,
> but they have good and bad points.

I think we are altogether off in the weeds here.  We ought to start
with an implementation that pushes nothing down, and then try to
figure out how much we can relax that without too much compromising
security.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 10:17), Stephen Frost wrote:
> KaiGai,
> 
> * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
>> Perhaps, pg_proc takes a new flag to represent it.
> 
> Without an actual well-formed approach for dealing with this which
> requires it, it's far too soon to be asking for changes in the catalog.
> Please stop suggesting individual maybe-this-would-help changes.  We
> need an actual solution which addresses all, or at least most, of the
> concerns described, followed by a patch which implements the mechanics
> of the solution.  If the solution calls for an additional pg_proc field,
> then the patch should include it.
> 
OK, it was too implementation-specific.

Please return to the categorization with 3-level that I mentioned at
the previous message.

I guess nobody opposes to disable pushing down on functions categorized
to high-threat, such as lowrite() and so on.
It actually can move given arguments to other tables, files, ...

And, I guess nobody wants to tackle an issue categorized to low-threat,
such as EXPLAIN, PK/FK constraints and so on.
It can imply existence of invisible objects, but no leaks of actual value.


Our headache is on functions categorized to middle-threat. It enables to
leak the given arguments using error messages. Here are several ideas,
but they have good and bad points.

At first, it is necessary whether we should fix up the threat in this
level, or not. It seems to me majority of the -hackers want to fix both
of high and middle level.

If we should fix up the issue, I think we have only two options semantically.

(A) It prevents leaky functions to be pushed down, so no invisible
information will be provided. But it makes performance regression.

(B) It prevents leaky functions to raise an error, although we allow
it to be pushed down. But is needs large scale of code changes.

Of course, it has trade-off. As TCSEC mentioned, we are facing with the
large potential cost of reducing the bandwidths of such covert channel

> Not sure if this would translate well, but asking for new flags to be
> added to pg_proc right now is putting the cart before the horse.  We
> don't even know which functions we might mark as trusted or not yet, nor
> is it even clear that adding such a flag would actually help.  Adding a
> flag to pg_proc is certainly nothing like a solution to this problem by
> itself.
>
For built-in functions, the code should be reviewed to ensure it does not
expose the given argument using error messages.
Then, we can mark it as trusted.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
KaiGai,

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
> Perhaps, pg_proc takes a new flag to represent it.

Without an actual well-formed approach for dealing with this which
requires it, it's far too soon to be asking for changes in the catalog.
Please stop suggesting individual maybe-this-would-help changes.  We
need an actual solution which addresses all, or at least most, of the
concerns described, followed by a patch which implements the mechanics
of the solution.  If the solution calls for an additional pg_proc field,
then the patch should include it.

Not sure if this would translate well, but asking for new flags to be
added to pg_proc right now is putting the cart before the horse.  We
don't even know which functions we might mark as trusted or not yet, nor
is it even clear that adding such a flag would actually help.  Adding a
flag to pg_proc is certainly nothing like a solution to this problem by
itself.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 9:46), Tom Lane wrote:
> KaiGai Kohei  writes:
>> In this case, is it unnecessary to expose the given argument in
>> the error message (from security perspective), isn't it?
> 
> Yes, if all you care about is security and not usability, that looks
> like a great solution.  We're *not* doing it.
> 
Sorry, are you saying we should not revise error messages because
of usability??

If so, and if we decide the middle-threat also should be fixed,
it is necessary to distinguish functions trusted and untrusted,
even if a function is built-in.
Perhaps, pg_proc takes a new flag to represent it.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Greg Stark
On Tue, Jun 8, 2010 at 1:46 AM, Tom Lane  wrote:
> KaiGai Kohei  writes:
>> In this case, is it unnecessary to expose the given argument in
>> the error message (from security perspective), isn't it?
>
> Yes, if all you care about is security and not usability, that looks
> like a great solution.  We're *not* doing it.

It's possible to take a more nuanced angle on this approach. You could
imagine a flag indicating whether the call is within a SECURE VIEW
which if enabled caused error messages to elide data taken from
arguments. In order for this not to destroy the code legibility I
think you would need to design some new %-escapes for error messages
to indicate such arguments or some similar trick. You wouldn't want to
have to put extra conditionals everywhere. And I'm not sure where to
conveniently put such a flag so it would have the right semantics.

It would still be a lot of work and a big patch, but mostly mechanical
and it wouldn't impact usability for any case where it wasn't
necessary. It would still have the problem described earlier that we
would keep finding new omissions for years. I can't see us
implementing variable taint tracking in C to do it automatically
though. Perhaps we could get it from one of the static analysis tools.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 9:23), KaiGai Kohei wrote:
> (2010/06/07 21:56), Stephen Frost wrote:
>> * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
>>> WHERE should do it:
>>>
>>> SELECT * FROM secrets_view WHERE username = 'neighbor' AND
>>> password::integer = 1234;
>>> ERROR:  invalid input syntax for integer: "neighborssecretpassword"
>>>
>>> Assuming that username = 'neighbor' is evaluated before the cast.
>>
>> Fair enough, so we can't allow built-ins either, except perhaps in very
>> specific/limited situations.  Still, if we track that the above WHERE
>> and password::integer calls *should* be run as "role X", while the view
>> should run as "role Y", maybe we can at least identify the case where
>> we've ended up in a situation where we are going to expose unintended
>> data.  I don't know enough about the optimizer or the planner to have
>> any clue how we might teach them to actually avoid doing such, though I
>> certainly believe it could end up being a disaster on performance based
>> on comments from others who know better. :)
>>
> 
> My opinion is that it is a matter in individual functions, not optimizer.
> Basically, built-in functions *should* be trusted, because our security
> mechanism is not designed to prevent anything from malicious internal
> binary modules.
> 
Sorry, it does not mean *all* the built-in functions could be trusted.
Some of built-in ones cannot be "trusted" from definitions, such as
lowrite().

Perhaps, it eventually needs a flag in the pg_proc to mark a function
being either trusted or untrusted. Then, planner may be able to check
the flag to decide whether is can be pushed down, or not.

If so, we can mark int4in() as trusted, when we revise the issue of
error message. I think the idea makes this problem more simple.

Thanks,

> Historically, we have not known the risk to leak invisible information
> using error messages for a long time, so most of internal functions have
> not been designed not to return users unnecessary information.
> If so, it needs to revise error messages, but it will not complete with
> a single commit.
> 
> Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Tom Lane
KaiGai Kohei  writes:
> In this case, is it unnecessary to expose the given argument in
> the error message (from security perspective), isn't it?

Yes, if all you care about is security and not usability, that looks
like a great solution.  We're *not* doing it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/07 21:56), Stephen Frost wrote:
> * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
>> WHERE should do it:
>>
>> SELECT * FROM secrets_view WHERE username = 'neighbor' AND
>> password::integer = 1234;
>> ERROR:  invalid input syntax for integer: "neighborssecretpassword"
>>
>> Assuming that username = 'neighbor' is evaluated before the cast.
> 
> Fair enough, so we can't allow built-ins either, except perhaps in very
> specific/limited situations.  Still, if we track that the above WHERE
> and password::integer calls *should* be run as "role X", while the view
> should run as "role Y", maybe we can at least identify the case where
> we've ended up in a situation where we are going to expose unintended
> data.  I don't know enough about the optimizer or the planner to have
> any clue how we might teach them to actually avoid doing such, though I
> certainly believe it could end up being a disaster on performance based
> on comments from others who know better. :)
> 

My opinion is that it is a matter in individual functions, not optimizer.
Basically, built-in functions *should* be trusted, because our security
mechanism is not designed to prevent anything from malicious internal
binary modules.

Historically, we have not known the risk to leak invisible information
using error messages for a long time, so most of internal functions have
not been designed not to return users unnecessary information.
If so, it needs to revise error messages, but it will not complete with
a single commit.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/07 20:53), Heikki Linnakangas wrote:
> On 07/06/10 14:06, Stephen Frost wrote:
>> * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
>>> The big difference is what information can be obtained, not how fast it
>>> can be obtained.
>>
>> Actually, I disagree. Time required to acquire the data does matter.
> 
> Depends on the magnitude, of course. If it takes 1 year per row, that's 
> probably acceptable. If it takes 1 second, that's extremely slow 
> compared to normal queries, but most likely still disastreous from a 
> security point of view.
> 
FYI, the classic also mentioned about bandwidth of covert channel,
although it was already obsoleted.

See the page.80 of:
  http://csrc.nist.gov/publications/history/dod85.pdf

It said 1bit/sec are acceptable on DoD in 25 years ago.

>>> Imagine a table that holds username/passwords for users. Each user is
>>> allowed to see his own row, including password, but not anyone else's.
>>> EXPLAIN side-channel might give pretty accurate information of how many
>>> rows there is in the table, and via clever EXPLAIN+statistics probing
>>> you might be able to find out what the top-10 passwords are, for
>>> example. But if you wanted to know what your neighbor's password is, the
>>> side-channels would not help you much, but an error message would reveal
>>> it easily.
>>
>> Using only built-ins, could you elaborate on how one could pick exactly
>> what row was revealed using an error case? That strikes me as
>> difficult, but perhaps I'm not thinking creatively enough.
> 
> WHERE should do it:
> 
> SELECT * FROM secrets_view WHERE username = 'neighbor' AND 
> password::integer = 1234;
> ERROR: invalid input syntax for integer: "neighborssecretpassword"
> 
> Assuming that username = 'neighbor' is evaluated before the cast.
> 

In this case, is it unnecessary to expose the given argument in
the error message (from security perspective), isn't it?
Because it is basically matter of the integer input handler,
it seems to me what we should fix up is int4in(), not optimizer.

Perhaps, we should categorize the issued functionalities base on
the level of its threat when abused.

* High-threat
Functions have side-effect that allows to move the given arguments
into another tables or other high-bandwidth chennel.
E.g) lowrite(), pg_write_file()

 -> It should be fixed soon.

* Middle-threat
Functions have side-effect that allows to move the given arguments
using error messages or other low-bandwidth channel.
E.g) int4in()

 -> It should be fixed in long term.

* Row-threat
Functions can imply existence of invisible tuples, but it does not
expose the value itself.
E.g) EXPLAIN statement, PK/FK constraints

 -> It should not be fixed in PostgreSQL.

Now we allow all of them.
But isn't it valuable to fix the high-threat first?
Then, we can revise error messages in built-in functions, I think.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
> WHERE should do it:
>
> SELECT * FROM secrets_view WHERE username = 'neighbor' AND  
> password::integer = 1234;
> ERROR:  invalid input syntax for integer: "neighborssecretpassword"
>
> Assuming that username = 'neighbor' is evaluated before the cast.

Fair enough, so we can't allow built-ins either, except perhaps in very
specific/limited situations.  Still, if we track that the above WHERE
and password::integer calls *should* be run as "role X", while the view
should run as "role Y", maybe we can at least identify the case where
we've ended up in a situation where we are going to expose unintended
data.  I don't know enough about the optimizer or the planner to have
any clue how we might teach them to actually avoid doing such, though I
certainly believe it could end up being a disaster on performance based
on comments from others who know better. :)

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Heikki Linnakangas

On 07/06/10 14:06, Stephen Frost wrote:

* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:

The big difference is what information can be obtained, not how fast it
can be obtained.


Actually, I disagree.  Time required to acquire the data does matter.


Depends on the magnitude, of course. If it takes 1 year per row, that's 
probably acceptable. If it takes 1 second, that's extremely slow 
compared to normal queries, but most likely still disastreous from a 
security point of view.



Imagine a table that holds username/passwords for users. Each user is
allowed to see his own row, including password, but not anyone else's.
EXPLAIN side-channel might give pretty accurate information of how many
rows there is in the table, and via clever EXPLAIN+statistics probing
you might be able to find out what the top-10 passwords are, for
example. But if you wanted to know what your neighbor's password is, the
side-channels would not help you much, but an error message would reveal
it easily.


Using only built-ins, could you elaborate on how one could pick exactly
what row was revealed using an error case?  That strikes me as
difficult, but perhaps I'm not thinking creatively enough.


WHERE should do it:

SELECT * FROM secrets_view WHERE username = 'neighbor' AND 
password::integer = 1234;

ERROR:  invalid input syntax for integer: "neighborssecretpassword"

Assuming that username = 'neighbor' is evaluated before the cast.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
Heikki,

* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
> The big difference is what information can be obtained, not how fast it
> can be obtained.

Actually, I disagree.  Time required to acquire the data does matter.

> Imagine a table that holds username/passwords for users. Each user is
> allowed to see his own row, including password, but not anyone else's.
> EXPLAIN side-channel might give pretty accurate information of how many
> rows there is in the table, and via clever EXPLAIN+statistics probing
> you might be able to find out what the top-10 passwords are, for
> example. But if you wanted to know what your neighbor's password is, the
> side-channels would not help you much, but an error message would reveal
> it easily.

Using only built-ins, could you elaborate on how one could pick exactly
what row was revealed using an error case?  That strikes me as
difficult, but perhaps I'm not thinking creatively enough.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Heikki Linnakangas
On 07/06/10 10:30, KaiGai Kohei wrote:
> (2010/06/07 15:48), Heikki Linnakangas wrote:
>> There's many side channels like exposing row counts in EXPLAIN and
>> statistics and timing attacks, that are not as critical, because they
>> don't let expose all data, and the attacker can't accurately choose what
>> data is exposed. Those are not as important.
>>
> It also means; because they can provide much smaller bandwidth to leak
> invisible information than error messages, these are not as important.
> Is it right?

The big difference is what information can be obtained, not how fast it
can be obtained.

Imagine a table that holds username/passwords for users. Each user is
allowed to see his own row, including password, but not anyone else's.
EXPLAIN side-channel might give pretty accurate information of how many
rows there is in the table, and via clever EXPLAIN+statistics probing
you might be able to find out what the top-10 passwords are, for
example. But if you wanted to know what your neighbor's password is, the
side-channels would not help you much, but an error message would reveal
it easily.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/07 15:48), Heikki Linnakangas wrote:
> On 07/06/10 06:06, Stephen Frost wrote:
>> Also, perhaps I'm not being paranoid enough, but all this concern over
>> error cases really doesn't really worry me that much. The amount of
>> data one could acquire that way is pretty limited.
> 
> It's not limited. It allows you to read all contents of the underlying 
> table or tables. I don't see much point doing anything at all if we 
> don't plug that.
> 
IIRC, Oracle pays attention not to expose function's arguments via
error messages due to the security reasons, although it focuses on
that does not provide a hint to attacker who tries SQL injection.

It seems to me it is a matter of degree.

If we allows to execute lowrite() inside of the join loop, it can
be abused to move massive invisible information into visible large
objects within a little time. So, its priority is relatively higher
than error messages.

However, we cannot move massive information via error messages in
a little time, although it may expose whole of the arguments.
So, its threat is relatively smaller.

> There's many side channels like exposing row counts in EXPLAIN and 
> statistics and timing attacks, that are not as critical, because they 
> don't let expose all data, and the attacker can't accurately choose what 
> data is exposed. Those are not as important.
> 
It also means; because they can provide much smaller bandwidth to leak
invisible information than error messages, these are not as important.
Is it right?

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-06 Thread Heikki Linnakangas

On 07/06/10 06:06, Stephen Frost wrote:

Also, perhaps I'm not being paranoid enough, but all this concern over
error cases really doesn't really worry me that much.  The amount of
data one could acquire that way is pretty limited.


It's not limited. It allows you to read all contents of the underlying 
table or tables. I don't see much point doing anything at all if we 
don't plug that.


There's many side channels like exposing row counts in EXPLAIN and 
statistics and timing attacks, that are not as critical, because they 
don't let expose all data, and the attacker can't accurately choose what 
data is exposed. Those are not as important.



--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-06 Thread KaiGai Kohei
(2010/06/07 12:06), Stephen Frost wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> Another idea I had was... would it be safe to trust functions defined
>> by the same user who owns the view?  If he's granted access to the
>> view and the function to some other user, presumably he doesn't mind
>> them being used together?  Or is that too optimistic?
> 
> This was more-or-less what I'd been kind of kicking around in my head.
> Forget about functions that are defined in the view itself.  Any other
> functions, etc, which are attached to the view by the calling user would
> be suspect, etc.  Perhaps with the exception of some built-ins that
> we've marked as "safe" in some way.
> 
> My first thought was to track the "run this as X" information on every
> RTE (more-or-less, relations, function calls, etc) and then at least be
> able to, hopefully, *detect* situations that might be a problem- eg:
> running a function which has "run as Q" against a relation that was
> accessed as "run as R" when a filter "run as R" happens later.  This is
> all far too hand-wavey, I'm sure, but at least if we could detect it
> then we might be able to find a way to deal with it.
> 
It is a similar idea to what I tried to implement at the conceptual
patch. It checks privileges of calling user on the underlaying tables
to be referenced using views. If violated, it prevents to push down
the user given FuncExpr into join loops of views. Otherwise, it does
not prevent anything, because the user originally has privileges to
reference them.
Are you suggesting the idea (with adjustments such as "safe" functions)?

> Also, perhaps I'm not being paranoid enough, but all this concern over
> error cases really doesn't really worry me that much.  The amount of
> data one could acquire that way is pretty limited.  It'd be great if we
> could deal with that case too, but maybe we could worry about the bigger
> issue (at least, as I see it) first.
> 
As I also mentioned at previous message. It seems to me a good
point to focus on bandwidth of the covert channel.

The error message indeed can deliver information to be invisible,
but I don't think its priority is higher than functions that can
be abused to direct data flow channel, such as lowrite().

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-06 Thread KaiGai Kohei
(2010/06/07 10:38), Robert Haas wrote:
> On Fri, Jun 4, 2010 at 4:12 PM, Tom Lane  wrote:
>> Heikki Linnakangas  writes:
>>> On 04/06/10 22:33, Tom Lane wrote:
 A counterexample: suppose we had a form of type "text" that carried a
 collation specifier internally, and the comparison routine threw an
 error if asked to compare values with incompatible specifiers.  An index
 built on a column of all the same collation would work fine.  A query
 that tried to compare against a constant of a different collation would
 throw an error.
>>
>>> I can't take that example seriously. First of all, tacking a collation
>>> specifier to text values would be an awful hack.
>>
>> Really?  I thought that was under serious discussion.  But whether it
>> applies to text or not is insignificant; I believe there are cases just
>> like this in existence today for some datatypes (think postgis).
>>
>> The real point is that the comparison constant is under the control of
>> the attacker, and it's not part of the index.  Therefore "it didn't
>> throw an error during index construction" proves nothing whatever.
>>
>>> ... Secondly, it would be a
>>> bad idea to define the b-tree comparison operators to throw an error;
>>
>> You're still being far too trusting, by imagining that only *designed*
>> error conditions matter here.  Think about overflows, out-of-memory,
>> (perhaps intentionally) corrupted data, etc etc.
> 
> btree comparison operators should handle overflow and corrupted data
> without blowing up.  Maybe out-of-memory is worth worrying about, but
> I think that is a mighty thin excuse for abandoning this feature
> altogether.  You'd have to contrive a situation where the system was
> just about out of memory and something about the value being compared
> against resulted in the comparison blowing up or not.  I think that's
> likely to be rather hard in practice, and in any case it's a covert
> channel attack, which I think everyone already agrees is beyond the
> scope of what we can protect against.  You can probably learn more
> information more quickly about the unseen data by fidding with
> EXPLAIN, analyzing query execution times, etc.  As long as we are
> preventing the actual contents of the unseen tuples from being
> revealed, I feel reasonably good about it.
> 
It seems to me a good point. In general, any software have
possibility to leak information via cover-channels, and it is
nearly impossible to fix all the channels.
However, from a security perspective, covert channels with low
bandwidths represent a lower threat than those with high bandwidths.
So, evaluation criteria stands on the viewpoint that it does not
necessary to eliminate all the covert channels more than necessity.

Even if we can estimate the contents of invisible tuples from error
messages or EXPLAIN output, its bandwidths are quite limited.

But, lowrite() might write out the given argument somewhere, if it
is invoked prior to security policy functions.
In this case, the contents of invisible tuples will be moved to
another large object unexpectedly with unignorable speed.
Such a direct data flow channel should be fixed at first, because of
its large bandwidth.

>> I think the only real fix would be something like what Marc suggested:
>> if there's a security view involved in the query, we simply don't give
>> the client the real error message.  Of course, now our "security
>> feature" is utterly disastrous on usability as well as performance
>> counts ...
> 
> Not pushing anything into the view is an equally real fix, although
> I'll be pretty depressed if that's where we end up.
> 
I could not find out any certified commercial RDBMS with functionality
to tackle covert channel. It includes Oracle's row-level security stuff.
So, I don't think it is our goal to prevent anything into views.

An idea is that we focus on the direct data flow which allows to move
information to be invisible into another tables, files and so on.
In this stage, I don't think the priority of error messages are high.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-06 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> Another idea I had was... would it be safe to trust functions defined
> by the same user who owns the view?  If he's granted access to the
> view and the function to some other user, presumably he doesn't mind
> them being used together?  Or is that too optimistic?

This was more-or-less what I'd been kind of kicking around in my head.
Forget about functions that are defined in the view itself.  Any other
functions, etc, which are attached to the view by the calling user would
be suspect, etc.  Perhaps with the exception of some built-ins that
we've marked as "safe" in some way.

My first thought was to track the "run this as X" information on every
RTE (more-or-less, relations, function calls, etc) and then at least be
able to, hopefully, *detect* situations that might be a problem- eg:
running a function which has "run as Q" against a relation that was
accessed as "run as R" when a filter "run as R" happens later.  This is
all far too hand-wavey, I'm sure, but at least if we could detect it
then we might be able to find a way to deal with it.

Also, perhaps I'm not being paranoid enough, but all this concern over
error cases really doesn't really worry me that much.  The amount of
data one could acquire that way is pretty limited.  It'd be great if we
could deal with that case too, but maybe we could worry about the bigger
issue (at least, as I see it) first.

Just my 2c.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-06 Thread Robert Haas
On Fri, Jun 4, 2010 at 4:12 PM, Tom Lane  wrote:
> Heikki Linnakangas  writes:
>> On 04/06/10 22:33, Tom Lane wrote:
>>> A counterexample: suppose we had a form of type "text" that carried a
>>> collation specifier internally, and the comparison routine threw an
>>> error if asked to compare values with incompatible specifiers.  An index
>>> built on a column of all the same collation would work fine.  A query
>>> that tried to compare against a constant of a different collation would
>>> throw an error.
>
>> I can't take that example seriously. First of all, tacking a collation
>> specifier to text values would be an awful hack.
>
> Really?  I thought that was under serious discussion.  But whether it
> applies to text or not is insignificant; I believe there are cases just
> like this in existence today for some datatypes (think postgis).
>
> The real point is that the comparison constant is under the control of
> the attacker, and it's not part of the index.  Therefore "it didn't
> throw an error during index construction" proves nothing whatever.
>
>> ... Secondly, it would be a
>> bad idea to define the b-tree comparison operators to throw an error;
>
> You're still being far too trusting, by imagining that only *designed*
> error conditions matter here.  Think about overflows, out-of-memory,
> (perhaps intentionally) corrupted data, etc etc.

btree comparison operators should handle overflow and corrupted data
without blowing up.  Maybe out-of-memory is worth worrying about, but
I think that is a mighty thin excuse for abandoning this feature
altogether.  You'd have to contrive a situation where the system was
just about out of memory and something about the value being compared
against resulted in the comparison blowing up or not.  I think that's
likely to be rather hard in practice, and in any case it's a covert
channel attack, which I think everyone already agrees is beyond the
scope of what we can protect against.  You can probably learn more
information more quickly about the unseen data by fidding with
EXPLAIN, analyzing query execution times, etc.  As long as we are
preventing the actual contents of the unseen tuples from being
revealed, I feel reasonably good about it.

> I think the only real fix would be something like what Marc suggested:
> if there's a security view involved in the query, we simply don't give
> the client the real error message.  Of course, now our "security
> feature" is utterly disastrous on usability as well as performance
> counts ...

Not pushing anything into the view is an equally real fix, although
I'll be pretty depressed if that's where we end up.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Tom Lane
Heikki Linnakangas  writes:
> On 04/06/10 22:33, Tom Lane wrote:
>> A counterexample: suppose we had a form of type "text" that carried a
>> collation specifier internally, and the comparison routine threw an
>> error if asked to compare values with incompatible specifiers.  An index
>> built on a column of all the same collation would work fine.  A query
>> that tried to compare against a constant of a different collation would
>> throw an error.

> I can't take that example seriously. First of all, tacking a collation 
> specifier to text values would be an awful hack.

Really?  I thought that was under serious discussion.  But whether it
applies to text or not is insignificant; I believe there are cases just
like this in existence today for some datatypes (think postgis).

The real point is that the comparison constant is under the control of
the attacker, and it's not part of the index.  Therefore "it didn't
throw an error during index construction" proves nothing whatever.

> ... Secondly, it would be a 
> bad idea to define the b-tree comparison operators to throw an error;

You're still being far too trusting, by imagining that only *designed*
error conditions matter here.  Think about overflows, out-of-memory,
(perhaps intentionally) corrupted data, etc etc.

I think the only real fix would be something like what Marc suggested:
if there's a security view involved in the query, we simply don't give
the client the real error message.  Of course, now our "security
feature" is utterly disastrous on usability as well as performance
counts ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Heikki Linnakangas

On 04/06/10 22:33, Tom Lane wrote:

Heikki Linnakangas  writes:

On 04/06/10 17:33, Tom Lane wrote:

Maybe the entire idea is unworkable.  I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?



Let's consider b-tree operators for an index on the secure table, for
starters. Surely a b-tree index comparison operator can't throw an error
on any value that's in the table already, you would've gotten an error
trying to insert that.


Man, are *you* trusting.

A counterexample: suppose we had a form of type "text" that carried a
collation specifier internally, and the comparison routine threw an
error if asked to compare values with incompatible specifiers.  An index
built on a column of all the same collation would work fine.  A query
that tried to compare against a constant of a different collation would
throw an error.


I can't take that example seriously. First of all, tacking a collation 
specifier to text values would be an awful hack. Secondly, it would be a 
bad idea to define the b-tree comparison operators to throw an error; it 
would be a lot more useful to impose an arbitrary order on the 
collations, so that all values with collation A are considered smaller 
than values with collation B. We do that for types like box; smaller or 
greater than don't make much sense for boxes, but we implement them in a 
pretty arbitrary way anyway to make it possible to build a b-tree index 
on them, and for the planner to use merge joins on them, and implement 
DISTINCT using sort etc.



I'm not sure. But indexable
operations are what we care about the most; the order of executing those
determines if you can use an index scan or not.


Personally, I care just as much about hash and merge join operators...


Hash seems safe too. Don't merge joins just use the default b-tree operator?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Tom Lane
Heikki Linnakangas  writes:
> On 04/06/10 17:33, Tom Lane wrote:
>> Maybe the entire idea is unworkable.  I certainly don't find any comfort
>> in your proposal in the above-referenced message to trust index
>> operators; where is it written that those don't throw errors?

> Let's consider b-tree operators for an index on the secure table, for 
> starters. Surely a b-tree index comparison operator can't throw an error 
> on any value that's in the table already, you would've gotten an error 
> trying to insert that.

Man, are *you* trusting.

A counterexample: suppose we had a form of type "text" that carried a
collation specifier internally, and the comparison routine threw an
error if asked to compare values with incompatible specifiers.  An index
built on a column of all the same collation would work fine.  A query
that tried to compare against a constant of a different collation would
throw an error.

> I'm not sure. But indexable 
> operations are what we care about the most; the order of executing those 
> determines if you can use an index scan or not.

Personally, I care just as much about hash and merge join operators...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Robert Haas
On Fri, Jun 4, 2010 at 1:46 PM, Heikki Linnakangas
 wrote:
> On 04/06/10 17:33, Tom Lane wrote:
>>
>> Heikki Linnakangas  writes:
>>>
>>> On 04/06/10 07:57, Tom Lane wrote:

 The proposal some time back in this thread was to trust all built-in
 functions and no others.
>>
>>> I thought I debunked that idea already
>>> (http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not
>>> all built-in functions are safe. Consider casting integer to text, for
>>> example.
>
> (I meant "text to integer", of course)
>
>> Maybe the entire idea is unworkable.  I certainly don't find any comfort
>> in your proposal in the above-referenced message to trust index
>> operators; where is it written that those don't throw errors?
>
> Let's consider b-tree operators for an index on the secure table, for
> starters. Surely a b-tree index comparison operator can't throw an error on
> any value that's in the table already, you would've gotten an error trying
> to insert that.
>
> Now, is it safe to expand that thinking to b-tree operators in general, even
> if there's no such index on the table? I'm not sure. But indexable
> operations are what we care about the most; the order of executing those
> determines if you can use an index scan or not.

Another idea I had was... would it be safe to trust functions defined
by the same user who owns the view?  If he's granted access to the
view and the function to some other user, presumably he doesn't mind
them being used together?  Or is that too optimistic?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Heikki Linnakangas

On 04/06/10 17:33, Tom Lane wrote:

Heikki Linnakangas  writes:

On 04/06/10 07:57, Tom Lane wrote:

The proposal some time back in this thread was to trust all built-in
functions and no others.



I thought I debunked that idea already
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not
all built-in functions are safe. Consider casting integer to text, for
example.


(I meant "text to integer", of course)


Maybe the entire idea is unworkable.  I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?


Let's consider b-tree operators for an index on the secure table, for 
starters. Surely a b-tree index comparison operator can't throw an error 
on any value that's in the table already, you would've gotten an error 
trying to insert that.


Now, is it safe to expand that thinking to b-tree operators in general, 
even if there's no such index on the table? I'm not sure. But indexable 
operations are what we care about the most; the order of executing those 
determines if you can use an index scan or not.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Marc Munro
On Fri, 2010-06-04 at 10:33 -0400, Tom Lane wrote:
> Hmm ... that's a mighty interesting example, because it shows that any
> well-meaning change in error handling might render seemingly-unrelated
> functions "unsafe".  And we're certainly not going to make error
> messages stop showing relevant information just because of this.

Although that looks like a show-stopper, I think it can be worked
around.  Errors in operations on security views could simply be caught
and conditionally rewritten.  The original error could still appear in
the logs but the full details need not be reported to unprivileged
users.

If that can be done, then we would still need to be able to identify
trusted functions and views used for security purposes, and ensure that
(please excuse my terminology if it is incorrect) untrusted quals do not
get pushed down inside secured views.  If all of that can be done along
with the error trapping, then we may have a solution.

My big concern is still about performance, particularly when joining
between multiple security views and other objects.  I don't expect to
get security for free but I don't want to see unnecessary barriers to
optimisation.

__
Marc


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Tom Lane
Heikki Linnakangas  writes:
> On 04/06/10 07:57, Tom Lane wrote:
>> The proposal some time back in this thread was to trust all built-in
>> functions and no others.

> I thought I debunked that idea already 
> (http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not 
> all built-in functions are safe. Consider casting integer to text, for 
> example. Seems innocent at first glance, but it's not; if the input is 
> not a valid integer, it throws an error which contains the input string, 
> revealing it.

Hmm ... that's a mighty interesting example, because it shows that any
well-meaning change in error handling might render seemingly-unrelated
functions "unsafe".  And we're certainly not going to make error
messages stop showing relevant information just because of this.

Maybe the entire idea is unworkable.  I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Heikki Linnakangas

On 04/06/10 07:57, Tom Lane wrote:

KaiGai Kohei  writes:

(2010/06/04 11:55), Robert Haas wrote:

A (very) important part of this problem is determining which quals are
safe to push down.


At least, I don't have an idea to distinguish trusted functions from
others without any additional hints, because we support variable kind
of PL languages. :(


The proposal some time back in this thread was to trust all built-in
functions and no others.


I thought I debunked that idea already 
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not 
all built-in functions are safe. Consider casting integer to text, for 
example. Seems innocent at first glance, but it's not; if the input is 
not a valid integer, it throws an error which contains the input string, 
revealing it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread KaiGai Kohei

(2010/06/04 18:26), Dimitri Fontaine wrote:

Tom Lane  writes:

The proposal some time back in this thread was to trust all built-in
functions and no others.  That's a bit simplistic, no doubt, but it
seems to me to largely solve the performance problem and to do so with
minimal effort.  When and if you get to a solution that's committable
with respect to everything else, it might be time to think about
more flexible answers to that particular point.


What about trusting all "internal" and "C" language function instead? My
understanding is that "internal" covers built-in functions, and as you
need to be a superuser to CREATE a "C" language function, surely you're
able to accept that by doing so you get to trust it?

How useful would that be?


If we trust all the "C" language functions, it also means DBA can never
install any binary functions having side-effect (e.g, pg_file_write() in
the contrib/adminpack ) without security risks.

If we need an intelligence to identify what functions are trusted and
what ones are untrusted, it will eventually need a hint to mark a certain
function as trusted, won't it?

Thanks,
--
KaiGai Kohei 

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Dimitri Fontaine
Tom Lane  writes:
> The proposal some time back in this thread was to trust all built-in
> functions and no others.  That's a bit simplistic, no doubt, but it
> seems to me to largely solve the performance problem and to do so with
> minimal effort.  When and if you get to a solution that's committable
> with respect to everything else, it might be time to think about
> more flexible answers to that particular point.

What about trusting all "internal" and "C" language function instead? My
understanding is that "internal" covers built-in functions, and as you
need to be a superuser to CREATE a "C" language function, surely you're
able to accept that by doing so you get to trust it?

How useful would that be?
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
(2010/06/04 13:57), Tom Lane wrote:
> KaiGai Kohei  writes:
>> (2010/06/04 11:55), Robert Haas wrote:
>>> A (very) important part of this problem is determining which quals are
>>> safe to push down.
>>>
>> At least, I don't have an idea to distinguish trusted functions from
>> others without any additional hints, because we support variable kind
>> of PL languages. :(
> 
> The proposal some time back in this thread was to trust all built-in
> functions and no others.  That's a bit simplistic, no doubt, but it
> seems to me to largely solve the performance problem and to do so with
> minimal effort.  When and if you get to a solution that's committable
> with respect to everything else, it might be time to think about
> more flexible answers to that particular point.
> 
Although I've not check all the functions within pg_proc.h, it seems to
me reasonable assumptions, except for a small number of exception which
have side-effects, such as lo_write().

Maybe, we have to shut out a small number of exceptions.
However, at least, it seems to me reasonable assumption in this stage.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread Tom Lane
KaiGai Kohei  writes:
> (2010/06/04 11:55), Robert Haas wrote:
>> A (very) important part of this problem is determining which quals are
>> safe to push down.
>> 
> At least, I don't have an idea to distinguish trusted functions from
> others without any additional hints, because we support variable kind
> of PL languages. :(

The proposal some time back in this thread was to trust all built-in
functions and no others.  That's a bit simplistic, no doubt, but it
seems to me to largely solve the performance problem and to do so with
minimal effort.  When and if you get to a solution that's committable
with respect to everything else, it might be time to think about
more flexible answers to that particular point.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
(2010/06/04 11:55), Robert Haas wrote:
> On Thu, Jun 3, 2010 at 1:23 PM, Marc Munro  wrote:
>> On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org
>> wrote:
>>> [ . . . ]
>>>
>>> In my current idea, when a qual-node that contains FuncExpr tries to
>>> reference a part of relations within a security view, its referencing
>>> relations will be expanded to whole of the security view at
>>> distribute_qual_to_rels().
>>> [ . . . ]
>>
>> I may be missing something here but this seems a bit too simplistic and,
>> I think, fails to deal with an important use case.
> 
> If anything, you're putting it mildly.  This is quite a bit too
> simplistic and fails to deal with several important issues, at least
> some of which have already been mentioned on this thread.
> 
Hmm. Was it too simplified even if just a proof of concept?

>> The optimiser ought to be able to spot the fact that i_can_see() need
>> only be called once for each joined result.  By placing a barrier (if I
>> understand your proposal correctly) between the outermost joins and the
>> inner views, doesn't this optimisation become impossible?
>>
>> I think a simpler solution may be possible here.  If you can tag the
>> function i_can_see() as a security function, at least in the context of
>> its use in the security views, and then create the rule that security
>> functions are always considered to be lower cost than user-defined
>> non-security functions, don't we achieve the result of preventing the
>> insecure function from seeing rows that it shouldn't?
> 
> So, yes and no.  You DO need a security barrier between the view and
> the rest of the query, but if a function can be trusted not to do evil
> things, then it should be allowed to be pushed down.  What we need to
> prevent is the pushdown of untrusted functions (or operators).  A
> (very) important part of this problem is determining which quals are
> safe to push down.
> 
At least, I don't have an idea to distinguish trusted functions from
others without any additional hints, because we support variable kind
of PL languages. :(

An idea is to add TRUSTED (or other) keyword for CREATE FUNCTION to mark
this function is harmless to execute, but only DBA can define functions
with the option.
(Perhaps, most of built-in functions should be marked as trusted?)

If we should identify a function as either trusted or untrusted, is
there any other ideas?

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
I fixed up the subject.

(2010/06/04 2:23), Marc Munro wrote:
> On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org
> wrote:
>> [ . . . ]
>>
>> In my current idea, when a qual-node that contains FuncExpr tries to
>> reference a part of relations within a security view, its referencing
>> relations will be expanded to whole of the security view at
>> distribute_qual_to_rels().
>> [ . . . ]
> 
> I may be missing something here but this seems a bit too simplistic and,
> I think, fails to deal with an important use case.
> 
> Security views, that do anything useful at all, tend to introduce
> performance issues.  I am concerned that placing a conceptual barrier
> between the secured and unsecured parts of queries is going to
> unnecessarily restrict what the optimiser can do.
> 
> For example consider that we have three secured views, each of the form:
> 
>create view s_x as select * from x where i_can_see(x.key);
> 
> and consider the query:
> 
>select stuff from s_x
>  inner join s_y on s_y.key = s_x.key
>  inner join s_z on s_z.key = s_x.key
>where fn(s_x.a) = 3;
> 
> The optimiser ought to be able to spot the fact that i_can_see() need
> only be called once for each joined result.  By placing a barrier (if I
> understand your proposal correctly) between the outermost joins and the
> inner views, doesn't this optimisation become impossible?
> 
It seems to me incorrect.

If i_can_see() can filter a part of tuples within table: x, the optimizer
prefers to call i_can_see() on scanning each tables rather than result of
join, because it effectively reduce the size of scan.

In fact, the existing optimizer make the following plan:

  postgres=# CREATE FUNCTION i_can_see(int) RETURNS bool IMMUTABLE
  AS 'begin return $1 % 1 = 1; end;' LANGUAGE 'plpgsql';
  CREATE FUNCTION
  postgres=# CREATE VIEW v1 as select * from t1 where i_can_see(a);
  CREATE VIEW
  postgres=# CREATE VIEW v2 as select * from t2 where i_can_see(x);
  CREATE VIEW
  postgres=# CREATE VIEW v3 as select * from t3 where i_can_see(s);
  CREATE VIEW

  postgres=# EXPLAIN SELECT * FROM (v1 JOIN v2 ON v1.a=v2.x) JOIN v3 on 
v1.a=v3.s;
QUERY PLAN
  
---
   Nested Loop  (cost=0.00..860.47 rows=410 width=108)
 ->  Nested Loop  (cost=0.00..595.13 rows=410 width=72)
   ->  Seq Scan on t1  (cost=0.00..329.80 rows=410 width=36)
 Filter: i_can_see(a)
   ->  Index Scan using t2_pkey on t2  (cost=0.00..0.63 rows=1 width=36)
 Index Cond: (t2.x = t1.a)
 Filter: i_can_see(t2.x)
 ->  Index Scan using t3_pkey on t3  (cost=0.00..0.63 rows=1 width=36)
   Index Cond: (t3.s = t1.a)
   Filter: i_can_see(t3.s)
  (10 rows)

Sorry, I may miss something your point.

> I think a simpler solution may be possible here.  If you can tag the
> function i_can_see() as a security function, at least in the context of
> its use in the security views, and then create the rule that security
> functions are always considered to be lower cost than user-defined
> non-security functions, don't we achieve the result of preventing the
> insecure function from seeing rows that it shouldn't?
> 
Sorry, it seems to me you mixed up different issues.

My patch tries to tackle the problem that optimizer distributes outermost
functions into inside of the view when the function takes arguments only
from a side of join, even if security views.
This issue is independent from cost of functions.

On the other hand, when a scan plan has multiple qualifiers to filter
fetched tuples, we have to pay attention on the order to be called.
If outermost functions are called prior to view's policy functions,
it may cause unexpected leaks.

In my opinion, we should consider the nestlevel of the function where
it is originally put. But it is not concluded yet, and the patch does
not tackle anything about this issue.

> I guess my concern is that a query may be constructed a=out of secured
> and unsecured parts and the optimiser should be free to group all of the
> secured parts together before considering the unsecured parts.
> 
Yes, it is an opinion, but it may cause unnecessary performance regression
when user given condition is obviously harmless.

E.g, If table: t has primary key t.a, and a security view is defined as:

  CREATE VIEW v AS SELECT * FROM t WHERE f_policy(t.b);

When user gives the following query:

  SELECT * FROM v WHERE a = 100;

If we strictly separate secured and unsecure part prior to optimizer,
it will break a chance to scan the table: t with index.

I also think security is not free, so it may take a certain level of
performance degradation. But it should not bring down more than necessity.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mail

[HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
The attached patch is a proof of concept.
It tries to fix the problem of leaky VIEWs for RLS.

* The scenario of leaky VIEWs for RLS
-
When a view contains any table joins and user gives a function that
takes arguments depending on only one-side table of the joins, the
planner tries to distribute the qualifier into least unit of scans.
It enables to minimize the scale of execution cost, but it means
user given function may be invoked earlier than other qualifiers
within a view, although row-level security is intended using views.
If the user given function has a side-effect (E.g, it may try to
insert given arguments into other tables), this behavior allows us
to leak contents of invisible tuples.

  postgres=# SELECT * FROM v2 WHERE f_malicious(y);
  NOTICE:  f_malicious: xxx
  NOTICE:  f_malicious: yyy <-- leaky contents
  NOTICE:  f_malicious: zzz
   a |  b  | x |  y
  ---+-+---+-
   1 | aaa | 1 | xxx
   3 | ccc | 3 | zzz
  (2 rows)


* Discussions in -hackers
-
Although it is a security problem, we will face unignorable performance
regression, if we disallow to distribute all user defined functions to
inside of joins come from views originally.

One idea is to distinguish a view into two types. The one is security
oriented view, and the other is regular view. Because only creator of
the view knows purpose of the view, an idea was suggested that takes
a hint on CREATE VIEW, like CREATE SECURITY VIEW.
(But I don't implement this feature in this patch yet.)

In addition, I was pointed out it may be harmless as long as
a person executes the query (not view's owner) has enough privileges
on underlaying tables, even if a view is marked as security view.

However, it was also pointed out the idea needs to check privileges
on planner stage, not execution stage. Is it really preferable?
It is not concluded yet. But inline_set_returning_function() checks
ACL_EXECUTE permission on planner stage, then it makes a FuncExpr
flatten prior to optimization. At least, it seems to me something
violation to the dogma.

Please point out any issues which I missed.


* About this patch
--
This patch mainly consists of two parts.

The first part is at pull_up_simple_subquery().
It checks privileges of underlaying tables in the subquery to be pulled
up, then it marks FromExpr->security_view as TRUE.

Note that it does not distinct a subquery come from a view and a pure
subquery in the original query. But it eventually causes access violation
error, if current user does not have privileges on a pure subquery.
So, no matter.

And, also note that it does not have statement support right now.
So, it assumes all the subqueries are possibly security views.

The second part is at distribute_qual_to_rels().
It computes what relations does the given clause depend on at first.
If the given clause depends on a part of relations to be joined
inside of the security view, it expands the dependency of the clause
into whole of the security view.
In the result, the clause is not distributed into inside of the join
loop.

(*) This patch uses check_relation_privileges() to check permissions
on DML execution, so please apply another my patch also on testing.

* Execution result
--

  postgres=# CREATE TABLE t1 (a int primary key, b text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for 
table "t1"
  CREATE TABLE
  postgres=# CREATE TABLE t2 (x int primary key, y text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey" for 
table "t2"
  CREATE TABLE
  postgres=# CREATE TABLE t3 (s int primary key, t text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t3_pkey" for 
table "t3"
  CREATE TABLE
  postgres=# CREATE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;
  CREATE VIEW
  postgres=# CREATE VIEW v2 AS SELECT * FROM v1 JOIN t3 ON v1.a = t3.s;
  CREATE VIEW
  postgres=# CREATE USER tak;
  CREATE ROLE
  postgres=# GRANT SELECT ON v1, v2, t3 TO tak;
  GRANT

The user: tak is allowed to select v1, v2 and t3.

  postgres=# CREATE OR REPLACE FUNCTION f_malicious(text) RETURNS bool
  AS 'BEGIN RAISE NOTICE ''f_malicious: %'', $1; RETURN true; END;'
  LANGUAGE plpgsql;
  CREATE FUNCTION

Because superuser can access t1 and t2 directly, f_malicious(y) shall be
distributed to Seq-scan on t2.

  postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(y);
  QUERY PLAN
  ---
   Hash Join  (cost=334.93..365.94 rows=410 width=72)
 Hash Cond: (t1.a = t2.x)
 ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
 ->  Hash  (cost=329.80..329.80 rows=410 width=36)
   ->  Seq Scan on t2  (cost=0.00..329.80 rows=410 width=36)
 Filter: f_malicious(y)
  (6 rows)

But tak cannot access t1 and t2 directly, f_malicious(y) shall be
distributed to outside of the