Re: Names of run-time configuration parameters (was: Limiting the operations that client-side code can perform upon its database backend's artifacts)

2022-10-01 Thread Tom Lane
Bryn Llewellyn  writes:
> I've seen this pattern in use:

>   create temp table if not exists pg_temp.flag(val boolean not null) on 
> commit delete rows;
>   insert into pg_temp.flag(val) values(true);

> But doing a DDL before every use of the session-state representation felt 
> heavier than assuming that it's there and creating the table only if it 
> isn't. But I haven't done any timing tests. Is the "create… if not exists" so 
> lightweight when the to-be-created object does exist that I'm fussing over 
> nothing?

Fair question.  My gut feeling is that the subtransaction created by the
BEGIN ... EXCEPTION construct is more expensive than a no-op CREATE
IF NOT EXISTS.  I've not measured it though; and I'm pretty sure that
the answer would vary depending on how often you expect the code to fall
through versus needing to create the table.

regards, tom lane




Re: Names of run-time configuration parameters (was: Limiting the operations that client-side code can perform upon its database backend's artifacts)

2022-10-01 Thread Bryn Llewellyn
> t...@sss.pgh.pa.us wrote:
> 
>> b...@yugabyte.com writes:
>> 
>>  get stacked diagnostics msg = message_text;
>>  if msg != 'relation "pg_temp.flag" does not exist' then
> 
> This is pretty fragile --- eg, it will fall over with translated messages. I 
> think you could presume that if the error condition name is undefined_table 
> then you know what to do.

Mea culpa. I should have stressed that my code was meant to be a sketch rather 
than the real thing. So my use of "on commit delete rows" suits the "hard shell 
paradigm" that I described here:

https://www.postgresql.org/message-id/f0a23614-749d-4a89-84c5-119d4000f...@yugabyte.com

where the client code does:

  check out connection
  call a user-defined API subprogram
  release connection

and where I therefore want automatic check-out-duration session state.

In a different use case, I'd want session-duration session state. There. I'd 
use "on commit preserve rows".

About testing what "message_text" from "get stacked diagnostics msg" returns, 
yes… of course its sensitivity to the current choice of national language is a 
non-starter. I don't like to assume more than I have to. So I might say this:

  if msg !~ '"pg_temp.flag"' then

But, then again, I might decide that it's just too fussy.

I've seen this pattern in use:

  create temp table if not exists pg_temp.flag(val boolean not null) on commit 
delete rows;
  insert into pg_temp.flag(val) values(true);

But doing a DDL before every use of the session-state representation felt 
heavier than assuming that it's there and creating the table only if it isn't. 
But I haven't done any timing tests. Is the "create… if not exists" so 
lightweight when the to-be-created object does exist that I'm fussing over 
nothing?



Re: Names of run-time configuration parameters (was: Limiting the operations that client-side code can perform upon its database backend's artifacts)

2022-10-01 Thread Tom Lane
Bryn Llewellyn  writes:
>   get stacked diagnostics msg = message_text;
>   if msg != 'relation "pg_temp.flag" does not exist' then

This is pretty fragile --- eg, it will fall over with translated
messages.  I think you could presume that if the error condition
name is undefined_table then you know what to do.

regards, tom lane




Re: Names of run-time configuration parameters (was: Limiting the operations that client-side code can perform upon its database backend's artifacts)

2022-10-01 Thread Bryn Llewellyn
> t...@sss.pgh.pa.us wrote:
> 
> Up to now, there's been an intentional policy of not documenting
> 
> «
> 20.16. Customized Options
> https://www.postgresql.org/docs/14/runtime-config-custom.html 
> 
> »
> 
> very prominently[*], because doing so would encourage people to abuse such 
> variables as application state variables. I say "abuse" because the code 
> supporting such variables isn't really designed to support lots of them.

I hinted at a different approach in an earlier turn in this thread:

https://www.postgresql.org/message-id/35254b0b-6501-4cf6-a13f-76d03756c...@yugabyte.com

I sketched only how you might handle the case where the session state is just a 
single value—by using a one-row, one-column temporary table with "on commit 
delete rows". But the general approach is to use a two column temporary table 
for key-value pairs. This approach is what the PG doc sketches here:

«
43.13. Porting from Oracle PL/SQL
https://www.postgresql.org/docs/current/plpgsql-porting.html
Since there are no packages, there are no package-level variables either. This 
is somewhat annoying. You can keep per-session state in temporary tables 
instead.
»

(That article of faith, "there are no packages and there never, ever will be", 
saddens me.)

Because PG has no event trigger that fires on session creation (why is this?), 
I've settled on this optimistic pattern:

begin
  insert into pg_temp.flag(val) values(true);
exception when undefined_table then
  get stacked diagnostics msg = message_text;
  if msg != 'relation "pg_temp.flag" does not exist' then
raise;
  else
create temp table pg_temp.flag(val boolean not null) on commit delete rows;
insert into pg_temp.flag(val) values(true);
  end if;
end;

The code would need to be more elaborate (and use "upsert") for key-value 
pairs. But that's easy to do.

Do the experts on this list disapprove of this pattern and prefer (for a future 
regime) something like the Pavel Stehule scheme that Tom mentioned?

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

[*] I didn't know that there was a PG doc policy sometimes not to call out a 
bad practice but, rather, to hide away (in an obscure backwater in the docs) 
the account of a feature that’s considered to be better avoided except in 
special cases. This effectively hides it from Google search (and similar) too 
because of the circular notion that few people find it, and fewer still publish 
pages that include the link,… and so on.

I suppose that calling the thing an "option" while the doc for the "set" SQL 
statement uses the term of art "run-time parameter" is another “bad practice 
admonition by obscurity” notion. (I've referred to the thing as a "user-defined 
run-time parameter" in informal emails to colleagues. But that is a lot of 
syllables.)

Re: Names of run-time configuration parameters (was: Limiting the operations that client-side code can perform upon its database backend's artifacts)

2022-10-01 Thread Tom Lane
"Peter J. Holzer"  writes:
> On 2022-10-01 20:24:21 +0800, Julien Rouhaud wrote:
>> This is intentional, and documented at
>> https://www.postgresql.org/docs/14/runtime-config-custom.html.

> So it's there but hard to find (I did read the table of contents at
> https://www.postgresql.org/docs/14/runtime-config.html but I guess after
> all the subchapters about specific configuration parameters I didn't
> notice the one generic subchapter near the end).
> I think referring to that chapter from
> https://www.postgresql.org/docs/14/sql-set.html and possibly other
> places would help.

Up to now, there's been an intentional policy of not documenting
this very prominently, because doing so would encourage people
to abuse such variables as application state variables.  I say
"abuse" because the code supporting such variables isn't really
designed to support lots of them.

There has been a patch in the works for a long time to provide a
better mechanism for application state variables [1].  Separately,
I did some work recently towards alleviating the performance
problems with lots of custom parameters [2].  It appears from
the discussion in [2] that we're probably going to stick to the
policy of not encouraging use of custom parameters as application
state, although if [1] keeps failing to land maybe that will change?

regards, tom lane


[1] 
https://www.postgresql.org/message-id/flat/CAFj8pRD053CY_N4=6SvPe7ke6xPbh=k50luaowjc3jm8me9...@mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/2982579.1662416...@sss.pgh.pa.us




Re: Names of run-time configuration parameters (was: Limiting the operations that client-side code can perform upon its database backend's artifacts)

2022-10-01 Thread Peter J. Holzer
On 2022-10-01 20:24:21 +0800, Julien Rouhaud wrote:
> On Sat, Oct 01, 2022 at 02:05:53PM +0200, Peter J. Holzer wrote:
> > On 2022-09-30 17:59:01 -0700, Bryn Llewellyn wrote:
> > > set rls.tenant_id=42;
> >
> > This works because there is a "." in the name. Without the "."
> > PostgreSQL complains:
> >
> > hjp=> set rls_tenant_id=42;
> > ERROR:  unrecognized configuration parameter "rls_tenant_id"
> >
> > I think I sort of knew this but had forgotten about it, so checked the
> > manual for the exact rules. Unfortunately I couldn't find them
[...]
> > From the documentation I get the impression that you can only set
> > existing parameters, not your own.
> >
> > I therefore suggest adding something like this to the section
> > "configuration_parameter" in
> > https://www.postgresql.org/docs/14/sql-set.html:
> >
[...]
> > (This of course assumes that the behaviour is intentional and not a
> > bug.)
> 
> This is intentional, and documented at
> https://www.postgresql.org/docs/14/runtime-config-custom.html.

So it's there but hard to find (I did read the table of contents at
https://www.postgresql.org/docs/14/runtime-config.html but I guess after
all the subchapters about specific configuration parameters I didn't
notice the one generic subchapter near the end).

I think referring to that chapter from
https://www.postgresql.org/docs/14/sql-set.html and possibly other
places would help.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Names of run-time configuration parameters (was: Limiting the operations that client-side code can perform upon its database backend's artifacts)

2022-10-01 Thread Julien Rouhaud
Hi,

On Sat, Oct 01, 2022 at 02:05:53PM +0200, Peter J. Holzer wrote:
> On 2022-09-30 17:59:01 -0700, Bryn Llewellyn wrote:
> > set rls.tenant_id=42;
>
> This works because there is a "." in the name. Without the "."
> PostgreSQL complains:
>
> hjp=> set rls_tenant_id=42;
> ERROR:  unrecognized configuration parameter "rls_tenant_id"
>
> I think I sort of knew this but had forgotten about it, so checked the
> manual for the exact rules. Unfortunately I couldn't find them (I
> checked https://www.postgresql.org/docs/14/sql-set.html,
> https://www.postgresql.org/docs/14/config-setting.html,
> https://www.postgresql.org/docs/14/functions-admin.html#FUNCTIONS-ADMIN-SET,
> and browser the table of content and the index).
>
> From the documentation I get the impression that you can only set
> existing parameters, not your own.
>
> I therefore suggest adding something like this to the section
> "configuration_parameter" in
> https://www.postgresql.org/docs/14/sql-set.html:
>
> | In addition, any name including a single dot (".") can be set. This
> | allows an application to define its own run-time parameters. Using the
> | application name as a prefix reduces the probability of name conflicts
> | with other applications or extensions.
>
> (This of course assumes that the behaviour is intentional and not a
> bug.)

This is intentional, and documented at
https://www.postgresql.org/docs/14/runtime-config-custom.html.




Names of run-time configuration parameters (was: Limiting the operations that client-side code can perform upon its database backend's artifacts)

2022-10-01 Thread Peter J. Holzer
On 2022-09-30 17:59:01 -0700, Bryn Llewellyn wrote:
> set rls.tenant_id=42;

This works because there is a "." in the name. Without the "."
PostgreSQL complains:

hjp=> set rls_tenant_id=42;
ERROR:  unrecognized configuration parameter "rls_tenant_id"


I think I sort of knew this but had forgotten about it, so checked the
manual for the exact rules. Unfortunately I couldn't find them (I
checked https://www.postgresql.org/docs/14/sql-set.html,
https://www.postgresql.org/docs/14/config-setting.html,
https://www.postgresql.org/docs/14/functions-admin.html#FUNCTIONS-ADMIN-SET,
and browser the table of content and the index).

From the documentation I get the impression that you can only set
existing parameters, not your own.

I therefore suggest adding something like this to the section
"configuration_parameter" in
https://www.postgresql.org/docs/14/sql-set.html:

| In addition, any name including a single dot (".") can be set. This
| allows an application to define its own run-time parameters. Using the
| application name as a prefix reduces the probability of name conflicts
| with other applications or extensions.

(This of course assumes that the behaviour is intentional and not a
bug.)

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-10-01 Thread Peter J. Holzer
On 2022-09-30 17:59:01 -0700, Bryn Llewellyn wrote:
> hjp-pg...@hjp.at wrote:
> b...@yugabyte.com wrote:
> Paraphrasing Peter, the design of the application's RDBMS backend has
> to implement its own notions of roles and privileges as a new layer on
> top of whatever the native RDBMS mechanisms provide. Some RDBMSs have
> native primitives that help the implementation of this next,
> application-specific, roles-and-privileges regime.
> 
> Can you provide a link to documentation (hopefully high-level and concise)
> of such a system? I'm having a hard time imagining one which I wouldn't
> either classify as "database roles by another name" or "just data".
> 
> 
> I fear that we might be about to start another round of mutual
> misunderstanding. I’m not confident that I understand the question.
> 
> I believe that you want x-refs to accounts of native RDBMS features that let
> you implement application-level specific notions of security on top of the
> basic roles and privileges features and that are oriented to the case where a
> single RDBMS role, "client", must expose differentiated functionality to
> different human end-users—where these principals are identified by, and
> authorized by, system(s) outside of the RDBMS in question.

I the context of this thread, I don't "want" anything (what I actually
want may change from project to project). I am not familiar with the
"native primitives" you mentioned, so I would like to read up on them.
So if you can just drop a few names I can feed them to my favourite
search engine. An overview article which explains the concept and how
the primitives are used would be better of course, but if you don't have
any at hand, no problem.

Background: I have used Oracle, MySQL/MariaDB and PostgreSQL enough that
I consider myself to be quite familiar with their capabilities (my
Oracle skills are getting rusty, though). I have occasionally used
and/or read about other RDBMSs, but my knowledge of those is very
spotty.


> Franck's PG RLS  policies are created like this:
> 
> create policy... on ... for... using(tenant_id=current_setting
> ('rls.tenant_id')::int);

So, I think the intermediate concept here which is neither role nor data
is the use of a run-time parameter.

He's not using a database role and he's not using a parameter which has
to be supplied to every query by the application programmer, but a
run-time parameter which would presumably be set once at the beginning
of a session or transaction (depending on whether you use connection
pooling). That's clever. Not sure if I would actually use it but it's
certainly something I'll add to my bag of tools.

> I'm afraid that I don't know what you mean by « classify as "database roles by
> another name" or "just data" ». For me, the RLS scheme is simply the native PG
> feature that you use to get the result that you want.

I meant what do you use to identify the user. The mechanism in
PostgreSQL intended for this are roles. You can use roles in RLS (in
fact I don't think I've ever seen an example which didn't use roles).

In contrast to this many systems (e.g. Django, Typo3, several I've
designed myself) use a table to keep their own list of users (and
possibly groups), and then check for every access whether the user
logged into the application has access by joining with that table. That
join may be explicit in the application code or it may be hidden behind
a view or a function. But the database doesn't know about that user.
It's just another parameter passed in by the application - "just data".

Using a run-time parameter is somewhere in between. The database still
doesn't know what it means (so in some sense it's still "just data"),
but it will keep the value for the duration of the transaction or
session, so the application gets sort of an "authorize once, then forget
about it" abstraction which puts it closer to the "roles by another
name" camp.


> Platform Multitenant Architecture
> https://architect.salesforce.com/fundamentals/platform-multitenant-architecture

That's *too* high-level for me. There's any number of techniques which
could be used to implement something like that and I don't see how they
actually did it (maybe I missed it - I admit I only skimmed the
article).

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-30 Thread Bryn Llewellyn
> hjp-pg...@hjp.at wrote:
> 
>> b...@yugabyte.com wrote:
>> 
>> Paraphrasing Peter, the design of the application's RDBMS backend has to 
>> implement its own notions of roles and privileges as a new layer on top of 
>> whatever the native RDBMS mechanisms provide. Some RDBMSs have native 
>> primitives that help the implementation of this next, application-specific, 
>> roles-and-privileges regime.
> 
> Can you provide a link to documentation (hopefully high-level and concise) of 
> such a system? I'm having a hard time imagining one which I wouldn't either 
> classify as "database roles by another name" or "just data".

I fear that we might be about to start another round of mutual 
misunderstanding. I’m not confident that I understand the question.

I believe that you want x-refs to accounts of native RDBMS features that let 
you implement application-level specific notions of security on top of the 
basic roles and privileges features and that are oriented to the case where a 
single RDBMS role, "client", must expose differentiated functionality to 
different human end-users—where these principals are identified by, and 
authorized by, system(s) outside of the RDBMS in question. In such a world, the 
RDBMS gets a suitable ID for the human (by all means suitably encrypted) and 
can use this to advantage by consulting its own representation of the current 
human's identity.

If my guess about your question is wrong, then I'm wasting my time. But I'll 
carry on anyway.

This is the canonical example:

—A manager must be able to see the salaries of all employees in the reporting 
hierarchy under her/him—but not be able to see salaries outside of that subtree.

Imagine the usual "employees" table with the pig's ear "manager_id" FK. But add 
an extra column for each employees external-to-the-database unique ID (unless 
this is already used as the table's PK).

PG has a native feature for this: row level security (hereinafter RLS). You can 
Google for blog posts about this use case. Here's one by a colleague of mine, 
Franck:

https://dev.to/yugabyte/hyper-scale-multi-tenant-for-saas-an-example-with-pgbench-33a3
 


It uses the multi-tenant use case rather than my "managers-and-salaries" one. 
But the principles are what matter. Never mind the detail. The critical 
table(s) are striped with "tenant_id". And this is passed from the application 
tier to the PG tier with:

set rls.tenant_id=42;

Using the "hard shell" paradigm, "rls.tenant_id" could be a formal argument to 
an API subprogram. And its within-database representation could be a one 
column, one row temporary table with "on commit delete rows".

Franck's PG RLS  policies are created like this:

create policy... on ... for... 
using(tenant_id=current_setting('rls.tenant_id')::int);

The argument of "using()" is just an arbitrary SQL expression that evaluates to 
a boolean.

https://www.postgresql.org/docs/current/sql-createpolicy.html

So in the "managers-and-salaries" use case, it could be "employee_id in ()".

I'm afraid that I don't know what you mean by « classify as "database roles by 
another name" or "just data" ». For me, the RLS scheme is simply the native PG 
feature that you use to get the result that you want. But the context of these 
examples is very definitely a connection pooling regime that uses the single 
role "client" on behalf of (very) many different human users.

Finally, in case your question sought an account of some real-world scheme in 
this general use case space, a bit of Googling for  « Salesforce multi-tenancy 
» got me this:

Platform Multitenant Architecture
https://architect.salesforce.com/fundamentals/platform-multitenant-architecture 


It's public knowledge that they use Oracle Database (or at least did—I haven't 
thought about them recently). But the general principles have their equivalents 
in PG too.

Salesforce engineers have also talked at Oracle User group conferences about 
how they use PL/SQL. Their main multi-tenant implementation (at least at the 
time of these talks) followed the "hard shell" paradigm.

Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-30 Thread Peter J. Holzer
On 2022-09-29 20:24:59 -0700, Bryn Llewellyn wrote:
> Paraphrasing Peter, the design of the application's RDBMS backend has to
> implement its own notions of roles and privileges as a new layer on top of
> whatever the native RDBMS mechanisms provide. Some RDBMSs have native
> primitives that help the implementation of this next, application-specific,
> roles-and-privileges regime.

Can you provide a link to documentation (hopefully high-level
and concise) of such a system? I'm having a hard time imagining one
which I wouldn't either classify as "database roles by another name" or
"just data".

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-29 Thread Bryn Llewellyn
> t...@sss.pgh.pa.us wrote:
> 
>> b...@yugabyte.com writes:
>> 
>>> rjuju...@gmail.com wrote:
>>> 
>>> I'm not convinced... that the authorization system can prevent an untrusted 
>>> user with a direct SQL access from actually hurting you.
>> 
>> What do you mean by "untrusted"? Any person who is given the credentials to 
>> start a database session is trusted—even a person who can connect as a 
>> superuser and do untold harm. So focus on a person who has the credentials 
>> to connect as "client" in my example. But imagine a design that exposes 
>> functionality to "client" sessions exclusively through a carefully designed 
>> and implemented API that's expressed exclusively with user-defined functions 
>> and procedures.
> 
> Sure. That is called an application server. What we are is a SQL server, and 
> that means that the API is SQL commands, and the authorization model is what 
> the SQL spec says it is. [So]... any session authenticated as user X has the 
> same privileges as any other session authenticated as user X, so there is not 
> a lot of point in user X mounting defenses against user X. So I think the 
> concerns you're expressing here would be better addressed at the next level 
> up the stack.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Julien replied to the same message from me that Tom did thus:

> rjuju...@gmail.com wrote 
> 
> You mean like if the application takes care of checking that the logged-in 
> user is allowed to insert data based on whatever application defined rules / 
> user profile, while the SQL role can simply insert data and/or call the 
> carefully written functions? Yes the data will be consistent, but if your 
> role just transferred money from an account to another that's not really the 
> problem.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Peter Holzer replied to an earlier message fro be in this branching thread thus:

> hjp-pg...@hjp.at wrote:
> 
> I think you did [misunderstand Tom]. What he was saying was that a database 
> design which uses only one single role which both owns all the objects and 
> executes all code on behalf of the user can not be secure. At the very least 
> you need two roles: One which owns the objects and one which can only use the 
> objects in a way allowed by the business logic and is not allowed to change 
> that logic (This is the case you demonstrated.)

Yes, indeed it is. That was my aim.

> In many cases this should be even more fine-grained, and at the extreme end 
> every user could actually have several roles, each with only the minimal 
> privileges required.
> 
> (I have often observed that we tend to throw away and build permission 
> systems at every layer of the stack: The OS has a system of users and 
> permissions. But the database server runs as a single user (postgres) which 
> has access to all the data files. So it has to implement its own system of 
> roles and permissions. Then an application developer comes along and writes 
> an app which uses a single database role which has access to all the data.

It often is done like this. But it doesn't have to be done this way. This is 
where a user-defined within-RDBMS functions and procedures, and so-called "run 
authority" in the wider discourse, come to the rescue.

> So it again has to implement its own system of users and permissions...)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

*SUMMARY*

I'd better simply attempt to stop further discussion in this thread by saying 
that no mutual understanding has been reached. I'm sad about this outcome. But 
it would seem to be unfair to invite anybody to spend any more time on these 
exchanges.

*MORE... ONLY IF YOU CAN BEAR TO READ IT*

First, lest a really awful misunderstanding lingers on, I must stress that, 
about this from Tom:

> any session authenticated as user X has the same privileges as any other 
> session authenticated as user X


well, yes. That statement of outcome is a tautology. I don't think that I wrote 
anything to say that I thought otherwise. About the notion that Ms. Jones, 
authorized as X can kill Ms. Smith's session authorized as X, and vice versa... 
this seems now to be just a red herring. Either the designer wants this or they 
 don't. And if they don't, they can prevent it by a simple revocation of one 
object privilege. It's been argued that some other evil doing can't be 
prevented. Well, fair enough. But this doesn't seem to justify not preventing 
any bad thing that *can* be prevented. 

There was an era, now long gone, when an outfit had its own production RDBMS 
(or a small few of these) in a locked and guarded room in the very same 
building within which all human end users of such a system worked. In those 
days, each human user was mapped one-to-one to its own within-RDBMS role. And 
the application under the end-users fingertips connected directly to the RDBMS 
by authentic

Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-28 Thread Peter J. Holzer
On 2022-09-27 19:29:39 -0700, Bryn Llewellyn wrote:
> I interpreted what Tom wrote to mean that he flatly rejected the idea
> that a database design was possible that prevented a client session
> that authorized as a role, that's designed for that purpose, from
> dropping tables and otherwise arbitrarily corrupting stuff. I expect
> that I completely misunderstood his point.

I think you did. What he was saying was that a database design which
uses only one single role which both owns all the objects and executes
all code on behalf of the user can not be secure. At the very least you
need two roles: One which owns the objects and one which can only use
the objects in a way allowed by the business logic and is not allowed to
change that logic (This is the case you demonstrated). In many cases
this should be even more fine-grained, and at the extreme end every user
could actually have several roles, each with only the minimal privileges
required.

(I have often observed that we tend to throw away and build permission
systems at every layer of the stack: The OS has a system of users and
permissions. But the database server runs as a single user (postgres)
which has access to all the data files. So it has to implement its own
system of roles and permissions. Then an application developer comes
along and writes an app which uses a single database role which has
access to all the data. So it again has to implement its own system of
users and permissions ...)

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-28 Thread Peter J. Holzer
On 2022-09-28 09:13:53 +0800, Julien Rouhaud wrote:
> On Tue, Sep 27, 2022 at 05:27:22PM -0700, Bryn Llewellyn wrote:
> > > hjp-pg...@hjp.at wrote:
> > >> rjuju...@gmail.com wrote:
> > >>> b...@yugabyte.com wrote:
> > >>> My demo seems to show that when a program connects as "client", it can
> > >>> perform exactly and only the database operations that the database 
> > >>> design
> > >>> specified. Am I missing something? In other words, can anybody show me a
> > >>> vulnerability?
> > >>
> > >> What exactly prevents the client role from inserting e.g.
> > >>
> > >> - 'robert''); drop table students; --'
> > >
> > > It can do this but it won't do any harm since the client role doesn't have
> > > permission to drop the table.
> 
> FTR it's a reference to https://xkcd.com/327/
> 
> Both of you are saying it's harmless because you're assuming that only the
> client role may read the data and act on it, but the whole point of SQL
> injection is to try to do actions that the role you have access to can't
> already do.

No. The point of SQL injection is to get client code (typically some web
application, but could be anything not directly under the attacker's
control) to execute SQL which it wasn't designed to execute. So for
example the student management system in the cartoon may not contain a
"drop" statement anywhere in its code. But because of the SQL injection
it will execute one anyway.

This doesn't elevate the privileges of the client role. The attacker can
merely leverage privileges the client role already has (which may be
more than it strictly needs).

If the attacker has direct access to the database then SQL injection is
moot since they can just execute the SQL anyway.


> > > There are often several layers of defense. The database frequently won't 
> > > be
> > > accessible from the open internet (or even the company network) directly.
> > > Only a middle tier of application servers running vetted client code will
> > > connect directly. Even those servers may not be accessible directly to end
> > > users. There may be a layer of proxy servers above them. Each of these
> > > layers may implement additional checks, rate limits and monitoring.
> 
> If no one has direct SQL access to the database, then there's no problem with 
> a
> role being able to pg_terminate_backend() session for the same role, and this
> thread shouldn't exist to begin with.

The attacker might use SQL injection to execute pg_terminate_backend().
But yes, as Tom already pointed out, if they can do that, that's
probably the least of your worries.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-27 Thread Julien Rouhaud
On Tue, Sep 27, 2022 at 08:47:52PM -0700, Bryn Llewellyn wrote:
>
> For example, the "lower case only" rule was meant to be an
> example of *any* data rule. Just like the write-once-read-many auto-generated
> surrogate primary key rule. Can you show me how those data rules, unrealistic
> as you might think them to be, can be violated?

An unprivileged role by definition can't ignore or drop table constraints,
that's not the problem here.  The discussion should be "is having a direct SQL
access to the same role as my carefully written application is using
problematic", and the answer is yes.

> > I'm not convinced... that the authorization system can prevent an untrusted
> > user with a direct SQL access from actually hurting you.
> 
> What do you mean by "untrusted"? Any person who is given the credentials to
> start a database session is trusted—even a person who can connect as a
> superuser and do untold harm. So focus on a person who has the credentials to
> connect as "client" in my example. But imagine a design that exposes
> functionality to "client" sessions exclusively through a carefully designed
> and implemented API that's expressed exclusively with user-defined functions
> and procedures. And choose something to model that meets your criteria for
> realism. Then show me, using a self-contained code example, how a session
> that authorized as "client" can cause the hurt that concerns you. Notice that
> "hurt" must be taken to mean having the persistently stored data no longer
> satisfying  as specified business rule. And not anything to do with denial of
> service based on unconstrained resource consumption.

You mean like if the application takes care of checking that the logged-in user
is allowed to insert data based on whatever application defined rules / user
profile, while the SQL role can simply insert data and/or call the carefully
written functions?  Yes the data will be consistent, but if your role just
transferred money from an account to another that's not really the problem.




Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-27 Thread Tom Lane
Bryn Llewellyn  writes:
> rjuju...@gmail.com wrote:
>> I'm not convinced... that the authorization system can prevent an untrusted 
>> user with a direct SQL access from actually hurting you.

> What do you mean by "untrusted"? Any person who is given the credentials
> to start a database session is trusted—even a person who can connect as
> a superuser and do untold harm. So focus on a person who has the
> credentials to connect as "client" in my example. But imagine a design
> that exposes functionality to "client" sessions exclusively through a
> carefully designed and implemented API that's expressed exclusively with
> user-defined functions and procedures.

Sure.  That is called an application server.  What we are is a SQL server,
and that means that the API is SQL commands, and the authorization model
is what the SQL spec says it is.  (Well, okay, there's some discrepancies
between the letter of the spec and what we actually do.  But my main
point here is that any session authenticated as user X has the same
privileges as any other session authenticated as user X, so there is
not a lot of point in user X mounting defenses against user X.)

So I think the concerns you're expressing here would be better addressed
at the next level up the stack.

regards, tom lane




Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-27 Thread Bryn Llewellyn
> rjuju...@gmail.com wrote:
> 
> b...@yugabyte.com wrote:
>> 
>> Now back to my new thread. I interpreted what Tom wrote to mean that he 
>> flatly rejected the idea that a database design was possible that prevented 
>> a client session that authorized as a role, that's designed for that 
>> purpose, from dropping tables and otherwise arbitrarily corrupting stuff. I 
>> expect that I completely misunderstood his point. But, anyway, that's what I 
>> responded to.
>> 
>> Now it seems that you, Julien, are not convinced that the code that I showed 
>> prevents a session that authorizes as "client" from dropping the table, 
>> owned by "u1", where the data is. Nor are you convinced that a "client" 
>> session is prevented from inserting mixed or upper case data, updating 
>> existing data, or deleting existing data. Rather (as your Bobby Tables 
>> reference indicates) you think that a cunning SQL injection attack can 
>> manage to do these bad things.
>> 
>> Well... the challenge is yours now: prove your point with some working code.
> 
> I'm convinced that that authorization system works as expected, what I'm not 
> convinced of is that the authorization system can prevent an untrusted user 
> with a direct SQL access from actually hurting you. So yes in your case maybe 
> the "client" role cannot drop the showed table, but it can still insert 
> nonsensical data, from a client point of view, or lead to outage or other 
> problems without any difficulty, and there's nothing in the authorization 
> system that can prevent that.
> 
> I'm also not convinced that your demo is proving anything, as "inserting any 
> only value made of non-uppercase characters in a single table" isn't really 
> representative of any basic application, especially without knowing what that 
> data will be used for.
> 
> The only case this example could make sense would be a log application, and 
> then a direct SQL access you can insert nonsensical or malicious data, 
> depending on what the application will do with those data (which could lead 
> to crash in the client application, or make it do thing it shouldn't do).

My example wasn't meant in any way to be realistic. I'm sorry if I didn't make 
that clear from the outset. It was meant only to illustrate the principles. For 
example, the "lower case only" rule was meant to be an example of *any* data 
rule. Just like the write-once-read-many auto-generated surrogate primary key 
rule. Can you show me how those data rules, unrealistic as you might think them 
to be, can be violated?

> I'm not convinced... that the authorization system can prevent an untrusted 
> user with a direct SQL access from actually hurting you.

What do you mean by "untrusted"? Any person who is given the credentials to 
start a database session is trusted—even a person who can connect as a 
superuser and do untold harm. So focus on a person who has the credentials to 
connect as "client" in my example. But imagine a design that exposes 
functionality to "client" sessions exclusively through a carefully designed and 
implemented API that's expressed exclusively with user-defined functions and 
procedures. And choose something to model that meets your criteria for realism. 
Then show me, using a self-contained code example, how a session that 
authorized as "client" can cause the hurt that concerns you. Notice that "hurt" 
must be taken to mean having the persistently stored data no longer satisfying  
as specified business rule. And not anything to do with denial of service based 
on unconstrained resource consumption.

If, when I review it, I can see how to change the code to remove the 
vulnerability, then you'll have learned something. On the other hand, if you 
can show me a vulnerability that cannot be fixed, then I'll have learned 
something! I'm selfishly more interested in that second outcome because my 
overall mental model will have been improved.





Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-27 Thread Julien Rouhaud
On Tue, Sep 27, 2022 at 07:29:39PM -0700, Bryn Llewellyn wrote:
>
> Now back to my new thread. I interpreted what Tom wrote to mean that he
> flatly rejected the idea that a database design was possible that prevented a
> client session that authorized as a role, that's designed for that purpose,
> from dropping tables and otherwise arbitrarily corrupting stuff. I expect
> that I completely misunderstood his point. But, anyway, that's what I
> responded to.
>
> Now it seems that you, Julien, are not convinced that the code that I showed
> prevents a session that authorizes as "client" from dropping the table, owned
> by "u1", where the data is. Nor are you convinced that a "client" session is
> prevented from inserting mixed or upper case data, updating existing data, or
> deleting existing data. Rather (as your Bobby Tables reference indicates) you
> think that a cunning SQL injection attack can manage to do these bad things.
>
> Well... the challenge is yours now: prove your point with some working code.

I'm convinced that that authorization system works as expected, what I'm not
convinced of is that the authorization system can prevent an untrusted user
with a direct SQL access from actually hurting you.

So yes in your case maybe the "client" role cannot drop the showed table, but
it can still insert nonsensical data, from a client point of view, or lead to
outage or other problems without any difficulty, and there's nothing in the
authorization system that can prevent that.

I'm also not convinced that your demo is proving anything, as "inserting any
only value made of non-uppercase characters in a single table " isn't really
representative of any basic application, especially without knowing what that
data will be used for.

The only case this example could make sense would be a log application, and
then a direct SQL access you can insert nonsensical or malicious data,
depending on what the application will do with those data (which could lead to
crash in the client application, or make it do thing it shouldn't do).




Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-27 Thread Bryn Llewellyn
> rjuju...@gmail.com wrote:
> 
>> b...@yugabyte.com wrote:
>> 
 hjp-pg...@hjp.at wrote:
 
> rjuju...@gmail.com wrote:
> 
>> b...@yugabyte.com wrote:
>> 
>> [Bryn] My demo seems to show that when a program connects as "client", 
>> it can perform exactly and only the database operations that the 
>> database design specified. Am I missing something? In other words, can 
>> anybody show me a vulnerability?
> 
> [Julien] What exactly prevents the client role from inserting e.g.
> 
> - 'robert''); drop table students; --'
 
 [Peter] It can do this but it won't do any harm since the client role 
 doesn't have permission to drop the table.
> 
> [Julien] For the record. it's a reference to [XKCD's "Little Bobby Tables" 
> cartoon]. Both of you are saying it's harmless because you're assuming that 
> only the client role may read the data and act on it, but the whole point of 
> SQL injection is to try to do actions that the role you have access to can't 
> already do. And that's just a few out of dozens of examples of how having a 
> role connected to the database can do harm.
> 
> [Julien]
> - millions of 'cat' rows
> - millions of 1GB-large rows
 
 [Peter] That depends on "the database operations that the database design 
 specified", but if the client role is supposed to be able to insert data, 
 you can't really prevent it from inserting non-sensical or enormous data. 
 You can encapsulate the insert functionality in a function or procedure 
 and do some sanity checks there. But automatically distinguishing between 
 legitimate use and abuse is generally not simple.
> 
> [Julien] Which is exactly what was the idea behind Tom's "if you don't trust 
> another session that is running as your userID, you have already lost".
> 
> [Julien] or just keep sending massive invalid query texts to fill the 
> logs, or just trying to connect until there's no available connection 
> slots anymore, and then keep spamming the server thousands of time per 
> second to try to open new connections, or ...?
 
 [Peter] There are often several layers of defense. The database frequently 
 won't be accessible from the open internet (or even the company network) 
 directly. Only a middle tier of application servers running vetted client 
 code will connect directly. Even those servers may not be accessible 
 directly to end users. There may be a layer of proxy servers above them. 
 Each of these layers may implement additional checks, rate limits and 
 monitoring.
> 
> [Julien] If no one has direct SQL access to the database, then there's no 
> problem with a role being able to pg_terminate_backend() session for the same 
> role, and this thread shouldn't exist to begin with.
> 
>> [Bryn] I'm afraid that I didn't see this from you until I'd already replied 
>> to Julien's turn in this thread. Sorry that I caused thread divergence. 
>> Thanks, Peter, for addressing the contribution(s) that other tiers in the 
>> stack make (and uniquely are able to make) in order to deliver the intended 
>> application functionality to the end user.
> 
> [Julien] Your whole argument in your other email was:
> 
>> [Bryn] Anyway... this kind of denial of service discussion is way outside 
>> the scope of what I addressed.
> 
> [Julien] which looks like in total contradiction with your original email:
> 
>> Am I missing something? In other words, can anybody show me a vulnerability?
> 
> [Julien] Again, don't give SQL access to untrusted users and you will avoid a 
> lot of problems, including someone abusing pg_terminate_backend().

I fear that you and I, Julien, are talking past each other. That's probably my 
fault. Any maybe there's no hope of rescue now.

My brand new thread, started here:

https://www.postgresql.org/message-id/3d119733-6784-4e84-98e4-5124e69d4...@yugabyte.com

has nothing whatsoever to do with pg_terminate_backend(). The thread that I 
started here:

https://www.postgresql.org/message-id/10f360bb-3149-45e6-bffe-10b9ae31f...@yugabyte.com

to ask about pg_terminate_backend() reached a nice "case closed" for me because 
I filled an embarrassing hole in my understanding. I see now that, in a 
database of interest, I can revoke execute on the "killer" proc and grant it to 
any role that needs it. Doing this is nothing other than following the 
principle of least privilege. It's liberating to know that "Thing X" that you 
don't need can be made impossible. And so much nicer than applying the 
"what-about-ism" approach: "Thing X" brings some risks. But so what? "Thing Y", 
and its legion cousins, bring risks too—so don't worry about preventing "Thing 
X".

Now back to my new thread. I interpreted what Tom wrote to mean that he flatly 
rejected the idea that a database design was possible that prevented a client 
session that authorized as a role, that's designed for that purpose, from 

Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-27 Thread Julien Rouhaud
On Tue, Sep 27, 2022 at 05:27:22PM -0700, Bryn Llewellyn wrote:
> > hjp-pg...@hjp.at wrote:
> >
> >> rjuju...@gmail.com wrote:
> >>
> >>> b...@yugabyte.com wrote:
> >>>
> >>> My demo seems to show that when a program connects as "client", it can
> >>> perform exactly and only the database operations that the database design
> >>> specified. Am I missing something? In other words, can anybody show me a
> >>> vulnerability?
> >>
> >> What exactly prevents the client role from inserting e.g.
> >>
> >> - 'robert''); drop table students; --'
> >
> > It can do this but it won't do any harm since the client role doesn't have
> > permission to drop the table.

FTR it's a reference to https://xkcd.com/327/

Both of you are saying it's harmless because you're assuming that only the
client role may read the data and act on it, but the whole point of SQL
injection is to try to do actions that the role you have access to can't
already do.  And that's just a few out of dozens of examples of how having a
role connected to the database can do harm.

> >
> >> - millions of 'cat' rows
> >> - millions of 1GB-large rows
> >
> > That depends on "the database operations that the database design
> > specified", but if the client role is supposed to be able to insert data,
> > you can't really prevent it from inserting non-sensical or enormous data.
> > You can encapsulate the insert functionality in a function or procedure and
> > do some sanity checks there. But automatically distinguishing between
> > legitimate use and abuse is generally not simple.

Which is exactly what was the idea behind Tom's "if you don't trust another
session that is running as your userID, you have already lost".

> >
> >> or just keep sending massive invalid query texts to fill the logs, or just
> >> trying to connect until there's no available connection slots anymore, and
> >> then keep spamming the server thousands of time per second to try to open
> >> new connections, or ...?
> >
> > There are often several layers of defense. The database frequently won't be
> > accessible from the open internet (or even the company network) directly.
> > Only a middle tier of application servers running vetted client code will
> > connect directly. Even those servers may not be accessible directly to end
> > users. There may be a layer of proxy servers above them. Each of these
> > layers may implement additional checks, rate limits and monitoring.

If no one has direct SQL access to the database, then there's no problem with a
role being able to pg_terminate_backend() session for the same role, and this
thread shouldn't exist to begin with.

> I'm afraid that I didn't see this from you until I'd already replied to
> Julien's turn in this thread. Sorry that I caused thread divergence. Thanks,
> Peter, for addressing the contribution(s) that other tiers in the stack make
> (and uniquely are able to make) in order to deliver the intended application
> functionality to the end user.

Your whole argument in your other email was:

> Anyway... this kind of denial of service discussion is way outside the scope
> of what I addressed.

which looks like in total contradiction with your original email:

> Am I missing something? In other words, can anybody show me a vulnerability?

Again, don't give SQL access to untrusted users and you will avoid a lot of
problems, including someone abusing pg_terminate_backend().




Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-27 Thread Bryn Llewellyn
> hjp-pg...@hjp.at wrote:
> 
>> rjuju...@gmail.com wrote:
>> 
>>> b...@yugabyte.com wrote:
>>> 
>>> My demo seems to show that when a program connects as "client", it can 
>>> perform exactly and only the database operations that the database design 
>>> specified. Am I missing something? In other words, can anybody show me a 
>>> vulnerability?
>> 
>> What exactly prevents the client role from inserting e.g.
>> 
>> - 'robert''); drop table students; --'
> 
> It can do this but it won't do any harm since the client role doesn't have 
> permission to drop the table.
> 
>> - millions of 'cat' rows
>> - millions of 1GB-large rows
> 
> That depends on "the database operations that the database design specified", 
> but if the client role is supposed to be able to insert data, you can't 
> really prevent it from inserting non-sensical or enormous data. You can 
> encapsulate the insert functionality in a function or procedure and do some 
> sanity checks there. But automatically distinguishing between legitimate use 
> and abuse is generally not simple.
> 
>> or just keep sending massive invalid query texts to fill the logs, or just 
>> trying to connect until there's no available connection slots anymore, and 
>> then keep spamming the server thousands of time per second to try to open 
>> new connections, or ...?
> 
> There are often several layers of defense. The database frequently won't be 
> accessible from the open internet (or even the company network) directly. 
> Only a middle tier of application servers running vetted client code will 
> connect directly. Even those servers may not be accessible directly to end 
> users. There may be a layer of proxy servers above them. Each of these layers 
> may implement additional checks, rate limits and monitoring.

I'm afraid that I didn't see this from you until I'd already replied to 
Julien's turn in this thread. Sorry that I caused thread divergence. Thanks, 
Peter, for addressing the contribution(s) that other tiers in the stack make 
(and uniquely are able to make) in order to deliver the intended application 
functionality to the end user.



Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-27 Thread Bryn Llewellyn
> rjuju...@gmail.com wrote:
> 
>> b...@yugabyte.com wrote:
>> 
>> My demo seems to show that when a program connects as "client", it can 
>> perform exactly and only the database operations that the database design 
>> specified. Am I missing something? In other words, can anybody show me a 
>> vulnerability?
> 
> What exactly prevents the client role from inserting e.g.
> 
> - 'robert''); drop table students; --'
> - millions of 'cat' rows
> - millions of 1GB-large rows
> 
> or just keep sending massive invalid query texts to fill the logs, or just 
> trying to connect until there's no available connection slots anymore, and 
> then keep spamming the server thousands of time per second to try to open new 
> connections, or ...?

My little code example was meant to show the basic principle: that the 
within-database artifacts that implement an application's backend can all be 
hidden from client code. The technique is simple and has been in use forever. 
That's why RDBMSs like PG have a privilege scheme. Object ownership and 
"current_role" are a central notions in any such scheme.

My demo depended upon an ordinary human regime of password secrecy. The key 
point that my demo made was that "client" owns nothing, lacks the "create" 
privilege on the database in question, and (for good measure) lacks "create" on 
all schemas in the database. Therefore, a session that authorizes as "client" 
is limited in what it can do.

I'm not sure what you mean to say with this fragment:

 'robert''); drop table students; --'

It rather looks like something that you see in an essay about SQL injection. 
But the entire SQL injection discussion is out of scope in my toy demo because 
the requirements statement simply allows a session that's authorized as 
"client" to issue any SQL statement. I don't know why you picked the "students" 
table when there isn't one. I just ran my demo code to completion, re-connected 
as "client", and did this:

drop table students;

It produces the "42P01: error: table "students" does not exist. Of course, the 
message isn't lying. So this is a better test:

drop view s.v;

This produces the "42501" error: must be owner of view v. This isn't a lie 
either. The hacker has now learned that, at least, such a view does exist. 
Arguably, the different between the two errors is a bad thing. And famously, in 
Oracle Database, you get a more generic "computer says no" in both cases. But 
PG is the way it is here and won't change in my lifetime. So, playing the 
hacker role, I tried this:

select definition from pg_views where schemaname = 's';

It caused the "42501" error: permission denied for view pg_views. And why 
shouldn't it? I didn't mention that I'd revoked "select" on every "pg_catalog" 
relation (and every "information_schema" relation) from public and then granted 
"select" explicitly on each to "u1" but not to "client".

This is the text-book principle of least privilege: you start with nothing and 
add what you need. For historical reasons, very few systems honor this 
principle by default. But it's an excellent feature of PG that you can overrule 
the default in the way that I described. The present toy demo works fine (all 
the tests behave the same) after my hardening intervention.

About inserting millions of rows, well... that's a word-game. The spec for my 
toy demo never mentioned that inserting millions of rows should be prevented.

There's only so far that you can go if you decide to articulate the hard-shell 
API as "use any SQL statement that you care to in order to access the intended 
app functionality". This is why the usual paradigm is to grant only "execute" 
on a designed set of subprograms that each implements a specified *business* 
function. People have been banging on about this approach since the late 
eighties (and probably since before then). Of course, the approach depends on a 
designed use of a privilege scheme. PG supports all this nicely. It's easy to 
implement an upper limit (in if-then-else code) on the number of rows that a 
procedure that implements "insert" allows. I s'pose that you'll say that the 
bad guy could call the procedure time and again. But techniques are available 
there too. (They're roughly analogous to what stops you making withdrawals from 
a bank account when the credit limit is reached.) Blocking a single "huge" row 
is trivial. Probably, a constraint that uses a SQL expression would suffice. 
But you can always implement the user-defined function for the hugeness test if 
you need to.

This leaves us with some kind of denial of service attack that uses a flavor of 
busy work or similar, like you mention. I don't think that there's any way that 
PG can prevent a connected role doing this:

do $body$
begin
  loop

  end loop;
end;
$body$;

or, say, a "select" with a recursive CTE with no stopping condition. There's 
always "set statement_timeout"—but that's in the hands of the session that 
authorizes as "client". I know of another 

Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-27 Thread Peter J. Holzer
On 2022-09-27 14:58:58 +0800, Julien Rouhaud wrote:
> On Mon, Sep 26, 2022 at 11:18:34AM -0700, Bryn Llewellyn wrote:
> > My demo seems to show that when a program connects as "client", it can
> > perform exactly and only the database operations that the database design
> > specified.
> >
> > Am I missing something? In other words, can anybody show me a vulnerability?
> 
> What exactly prevents the client role from inserting e.g.
> 
> - 'robert''); drop table students; --'

It can do this but it won't do any harm since the client role doesn't
have permission to drop the table-

> - millions of 'cat' rows
> - millions of 1GB-large rows

That depends on "the database operations that the database design
specified", but if the client role is supposed to be able to insert
data, you can't really prevent it from inserting non-sensical or
enormous data. You can encapsulate the insert functionality in a
function or procedure and do some sanity checks there. But automatically
distinguishing between legitimate use and abuse is generally not simple.

> or just keep sending massive invalid query texts to fill the logs, or just
> trying to connect until there's no available connection slots anymore, and 
> then
> keep spamming the server thousands of time per second to try to open new
> connections, or ...?

There are often several layers of defense. The database frequently won't
be accessible from the open internet (or even the company network)
directly. Only a middle tier of application servers running vetted
client code will connect directly. Even those servers may not be
accessible directly to end users. There may be a layer of proxy servers
above them. Each of these layers may implement additional checks, rate
limits and monitoring.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-26 Thread Julien Rouhaud
On Mon, Sep 26, 2022 at 11:18:34AM -0700, Bryn Llewellyn wrote:
>
> My demo seems to show that when a program connects as "client", it can
> perform exactly and only the database operations that the database design
> specified.
>
> Am I missing something? In other words, can anybody show me a vulnerability?

What exactly prevents the client role from inserting e.g.

- 'robert''); drop table students; --'
- millions of 'cat' rows
- millions of 1GB-large rows

or just keep sending massive invalid query texts to fill the logs, or just
trying to connect until there's no available connection slots anymore, and then
keep spamming the server thousands of time per second to try to open new
connections, or ...?




Limiting the operations that client-side code can perform upon its database backend's artifacts

2022-09-26 Thread Bryn Llewellyn
> On 14-Sep-2022, t...@sss.pgh.pa.us wrote:
> 
> …. Therefore, if you don't trust another session that is running as your 
> userID, you have already lost. That session can drop your tables, or corrupt 
> the data in those tables to an arbitrary extent, and the SQL permissions 
> system will not squawk even feebly… So if you're not happy with this hazard, 
> you should not be accepting the idea that actors you don't trust are allowed 
> to submit queries under the same userID as you. And if you're using a 
> client-side software stack that forces that situation on you, it's time to 
> look for another one.
> 
> Or in other words, I flatly reject the claim that this:
> 
>> b...@yugabyte.com wrote:
>> 
>> It's common to design a three tier app so that the middle tier always 
>> authorizes as just a single role—say, "client"—and where the operations that 
>> "client" can perform are limited as the overall design specifies.
> 
> is in any way sane or secure. There is not very much that the database server 
> can do to clean up after insecure client-side stacks.

*BACKGROUND*

I'm starting a new thread here. What I wrote, and Tom's response, are taken 
from a longish thread that I started with the subject "Is it possible to stop 
sessions killing each other when they all authorize as the same role?", here:

www.postgresql.org/message-id/10f360bb-3149-45e6-bffe-10b9ae31f...@yugabyte.com

That thread is "case closed" now. (My question arose from my basic 
misunderstanding of what's hard-wired and what is simply a default privilege 
regime that can be changed. And then I compounded my embarrassment by revoking 
"execute from public" on a "pg_catalog" function when "current_database()" had 
one value—and then not seeing the effect of this when "current_database()" had 
a different value.)

I wandered off topic with a claim about three tier app design. And that 
prompted Tom's response here:

https://www.postgresql.org/message-id/3100447.1663213208%40sss.pgh.pa.us



*ANYWAY...*

Tom's "I flatly reject" has been troubling me for the past couple of weeks. I 
wonder if what I wrote was unclear. I'll try a different way. First, w.r.t. 
Tom's

> the main point of a database is to store your data


I think that more needs to be said, thus:

« The main point of a database is to store your data, to keep it in compliance 
with all the specified data rules, and to allow authorized client-side code to 
modify the data by using only a set of specified business functions. »

This implies a carefully designed within-database regime that takes advantage 
of established notions: for encapsulating the implementation of business 
functions; and for access control. This, in turn, implies a minimum of two 
distinct roles: one to own the entire implementation. And another to allow 
exactly and only the specified business functions to be performed by 
client-side code. In a real use case, user-defined functions or procedures 
define the business function API. And there'd be typically several roles that 
share the implementation and that take advantage of access control notions 
among themselves. My code example, below, reduces this paradigm as far as I 
could manage to allow a convincing demo of the principles. It relies on this:

— People who implement client-side code to access the database are given *only* 
the credentials to connect as one particular user, "client", that exposes the 
business function API.

— All other connect credentials, including but not at all limited to superuser 
credentials, are kept secret within a manageably small community of server-side 
engineers.

— Self-evidently, this relies on carefully designed and strictly implemented 
human practices. But so, too, does any human endeavor where security matters. 
In our domain, this implies that the overall design has a carefully written 
prose specification and that the development shop delivers a set of install 
scripts. Then a trusted person whose job is to administer the deployed app 
scrutinizes the scripts and runs them. In the limit, just a single person knows 
the deployment site passwords and can set "rolcanlogin" to "false" for every 
role that owns the implementation artifacts once the installation is done.

My demo seems to show that when a program connects as "client", it can perform 
exactly and only the database operations that the database design specified.

Am I missing something? In other words, can anybody show me a vulnerability?

*THE DEMO*

The code example models the simplest form of "hard shell encapsulation" that I 
could manage.

(I now realize that, with some very manageable effort, I can revoke all 
privileges on every object in the "pg_catalog" schema from public and then 
re-grant as needed to whatever roles need them—following the famous principle 
of least privilege. So none would be granted to "client" with the result that 
it can't see metadata about anything. A prose document would suffice for 
communicating what client-side engin