Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-07-31 Thread KaiGai Kohei
You may not remember, but I also proposed a patch to support access
controls on largeobjects.

It was suggested that largeobjects should store their contents in the
TOAST relations and provide interfaces to read/write them partially.

For example:
 == lo_xxx() interfaces ==

 New versions of loread() and lowrite() are necessary to access
 a part of the toasted data within user-defined tables. They can be
 defined as follows:

   loread(Blob data, int32 offset, int32 length)
   lowrite(Blob data, int32 offset, Bytea data)

However, I found another problem when we tried to update TOAST chunks
partially.

Because the toast pointer contains the total size of the series of
chunks, not only the chunk_id and the toast relation's OID, we cannot
expand the size of the TOAST'ed data in place.

As shown below, the toast pointer is defined as follows:

  struct varatt_external
  {
  int32   va_rawsize; /* Original data size (includes header) */
  int32   va_extsize; /* External saved size (doesn't) */
  Oid va_valueid; /* Unique ID of value within TOAST table */
  Oid va_toastrelid;  /* RelID of TOAST table containing it */
  };

These fields are stored inline within the main tuple, while the contents
of the TOAST datum are stored separately in the toast relation.

If we update the toast data partially in a way that expands its total size
(e.g., writing 10,000 bytes from offset 6,000 of a 12,000-byte varlena,
growing it to 16,000 bytes), it is also necessary to update the main tuple,
because its va_rawsize and va_extsize must also be updated, even though
va_valueid and va_toastrelid can stay as they are.

I wonder why va_extsize and va_rawsize need to be included within the
toast pointer at all; they are fundamentally computable from the pair of
va_valueid and va_toastrelid.
(I guess it is for performance reasons. If we only need to know whether
the datum is compressed or not, a flag variable would provide enough of
a hint.)
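
Just to illustrate the point (the OIDs below are made up for the example),
the stored size can be recovered by scanning the chunks themselves:

  -- 16390 stands for a TOAST relation's OID and 24576 for a chunk_id
  -- (va_valueid); both are placeholders.  For an uncompressed external
  -- datum, the total chunk length equals va_extsize.
  SELECT sum(octet_length(chunk_data)) AS extsize,
         count(*)                      AS nr_chunks
    FROM pg_toast.pg_toast_16390
   WHERE chunk_id = 24576;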

So, we need to think more about how to implement partially writable
varlena values in user tables.

However, as far as the existing largeobject interfaces are concerned,
I believe they can be implemented using partially writable varlena.
The largeobject identifier is obvious when we use the lo_*() interfaces,
so we can also update the main tuple within pg_largeobject if the total
length of the varlena changes on a partial write.

Thus, I would like to implement the feature by focusing on the existing
largeobject interfaces as the first step.
Partially writable TOAST in user relations will come next.
(It also keeps each patch smaller than an all-in-one patch.)

In summary, here is what I plan to implement for the next commit fest:
- GRANT / REVOKE on largeobjects
- DAC access controls on largeobjects
- Store the contents of largeobjects in a TOAST relation
  (including support for holes)
- Provide largeobject interfaces to write them partially

Thanks,

KaiGai Kohei wrote:
 I have summarized the design proposal and the issues we currently have.

 I would like to see any comments on this proposal.
 In particular, the selection of the snapshot is a headache for me.
 
 
 This project tries to solve two items listed at:
   http://wiki.postgresql.org/wiki/Todo#Binary_Data
 
  * Add security checks for large objects
  * Allow read/write into TOAST values like large objects
 
 = Introduction =
 
 We need to associate metadata with each largeobject in order to
 implement security checks for largeobjects. However, the current data
 structure of largeobjects is not suitable for managing such metadata
 (owner identifier, ACLs, ...), because a largeobject is stored as
 separate page frames in the pg_largeobject system catalog.
 Thus, we need to revise the data structure used to manage a largeobject.
 
 An interesting fact is the similarity between the data structures of
 a TOAST table and pg_largeobject.
 A TOAST relation is declared as follows:
   pg_toast_%u (
       chunk_id    oid,
       chunk_seq   int4,
       chunk_data  bytea,
       unique(chunk_id, chunk_seq)
   )
 
 The definition of pg_largeobject is as follows:
   pg_largeobject(
       loid    oid,
       pageno  int4,
       data    bytea,
       unique(loid, pageno)
   )
 
 They have an identical data structure, so it is quite natural to
 utilize the TOAST mechanism to store the page frames of largeobjects.
 
 = Design =
 
 In my plan, the role of pg_largeobject will be changed to manage the
 metadata of largeobjects, and it will be redefined as follows:
 
   CATALOG(pg_largeobject,2613)
   {
       Oid      loowner;   /* OID of the owner */
       Oid      lonsp;     /* OID of the namespace */
       aclitem  loacl[1];  /* ACL of the largeobject */
       Blob     lodata;    /* Contents of the largeobject */
   } FormData_pg_largeobject;
 
 For access control purposes, its ownership, namespace and ACLs are
 necessary. In addition, Blob is a new type which supports reading and
 writing parts of its TOAST value.
 
 The current lo_xxx() 

Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-07-16 Thread Joshua Tolley
On Thu, Jul 16, 2009 at 01:49:14PM +0900, KaiGai Kohei wrote:
 Joshua,
 
 I found your name as a reviewer at commitfest.postgresql.org.

 However, I don't think the initial proposal of the largeobject
 security is in a state to be reviewed seriously right now.

 My preference is that it should be reworked based on the TOASTing
 approach discussed previously in this thread, if we can find a
 reasonable solution for the issue I raised before.

For whatever it's worth, I consider my capability limited to making sure the
patch applies and coming up with a few interesting ways of testing it out, but
not seriously studying the code and knowing when there might be a competitive
alternative implementation. I don't yet understand the issues that have been
raised, but will quit working to review the patch if you feel it's not ready.
Thanks for letting me know; I hope a solution to the problems you've brought
up is forthcoming.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com




Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-07-16 Thread Robert Haas
2009/7/16 KaiGai Kohei kai...@ak.jp.nec.com:
 However, I don't think the initial proposal of the largeobject
 security is in a state to be reviewed seriously right now.

OK, I am moving this patch to returned with feedback.

...Robert



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-07-16 Thread KaiGai Kohei
Robert Haas wrote:
 2009/7/16 KaiGai Kohei kai...@ak.jp.nec.com:
 However, I don't think the initial proposal of the largeobject
 security is in a state to be reviewed seriously right now.
 
 OK, I am moving this patch to returned with feedback.

If possible, I would like to have a discussion to reach consensus
about its design and interfaces before submitting my patch to
the next commit fest.

I'll submit its design proposal (including a few issues) again.
Is anyone interested?

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-07-16 Thread KaiGai Kohei
I have summarized the design proposal and the issues we currently have.

I would like to see any comments on this proposal.
In particular, the selection of the snapshot is a headache for me.


This project tries to solve two items listed at:
  http://wiki.postgresql.org/wiki/Todo#Binary_Data

 * Add security checks for large objects
 * Allow read/write into TOAST values like large objects

= Introduction =

We need to associate metadata with each largeobject in order to
implement security checks for largeobjects. However, the current data
structure of largeobjects is not suitable for managing such metadata
(owner identifier, ACLs, ...), because a largeobject is stored as
separate page frames in the pg_largeobject system catalog.
Thus, we need to revise the data structure used to manage a largeobject.

An interesting fact is the similarity between the data structures of
a TOAST table and pg_largeobject.
A TOAST relation is declared as follows:
  pg_toast_%u (
      chunk_id    oid,
      chunk_seq   int4,
      chunk_data  bytea,
      unique(chunk_id, chunk_seq)
  )

The definition of pg_largeobject is as follows:
  pg_largeobject(
      loid    oid,
      pageno  int4,
      data    bytea,
      unique(loid, pageno)
  )

They have an identical data structure, so it is quite natural to
utilize the TOAST mechanism to store the page frames of largeobjects.

= Design =

In my plan, the role of pg_largeobject will be changed to manage the
metadata of largeobjects, and it will be redefined as follows:

  CATALOG(pg_largeobject,2613)
  {
      Oid      loowner;   /* OID of the owner */
      Oid      lonsp;     /* OID of the namespace */
      aclitem  loacl[1];  /* ACL of the largeobject */
      Blob     lodata;    /* Contents of the largeobject */
  } FormData_pg_largeobject;

For access control purposes, its ownership, namespace and ACLs are
necessary. In addition, Blob is a new type which supports reading and
writing parts of its TOAST value.

The current lo_xxx() interfaces will act as wrapper functions to access
a certain pg_largeobject.lodata identified by a largeobject handle.
The loread(), lowrite() and similar interfaces will support partial
accesses on the Blob type. This enables user-defined relations to contain
large data using the TOAST mechanism, with reasonable resource
consumption. (Note that today TOAST replaces the whole set of chunks
under a new identifier, even if just a single byte changes.)

= Interfaces =

== New type ==

We need a new variable-length type with the following features to allow
users partial access.

* It always uses an external TOAST table, regardless of the data size.
  If the toasted data is stored inline, we cannot update it independently
  of the main table. Inline storage does not prevent partial reads, but
  that case is not interesting because inlined data is small anyway.

* It always stores the data without any compression.
  We cannot easily compute the required data offset within compressed
  data. All the toasted data needs to be uncompressed, for both reader
  and writer access.

== lo_xxx() interfaces ==

New versions of loread() and lowrite() are necessary to access a part
of the toasted data within user-defined tables. They can be defined as
follows:

  loread(Blob data, int32 offset, int32 length)
  lowrite(Blob data, int32 offset, Bytea data)
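
As a usage image (these overloads are only proposed here, they do not
exist yet; my_picture is the same example table as above):

  -- Read 8192 bytes starting at offset 4096 of a stored Blob.
  SELECT loread(pictdata, 4096, 8192) FROM my_picture;

  -- Overwrite the bytes starting at offset 4096.
  SELECT lowrite(pictdata, 4096, 'new data'::bytea) FROM my_picture;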

== GRANT/REVOKE ==

When we access traditional largeobjects, read permission (SELECT) or
write permission (UPDATE) should be checked on each access.

The GRANT/REVOKE statements are enhanced as follows:

  GRANT SELECT ON LARGE OBJECT 1234 TO kaigai;

It allows kaigai to read largeobject 1234.
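
The reverse direction works the same way under the proposed syntax, for
example:

  -- Proposed syntax, mirroring the GRANT form above.
  REVOKE SELECT ON LARGE OBJECT 1234 FROM kaigai;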

= Issues =

== New pg_type.typstorage ==

The variable-length data always needs to be stored in external storage,
uncompressed. No existing typstorage strategy satisfies this requirement,
so we need to add a new pg_type.typstorage strategy.

The new typstorage strategy enforces the following:
 - It always stores the given varlena data in the external toast relation.
 - It always stores the given varlena data without any compression.

This costs some performance, so the existing Text or Bytea types will
remain more suitable for variable-length data that is not very large.
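
For comparison, the closest existing knob is per-column EXTERNAL storage,
which already disables compression; however, it still keeps short values
inline in the main tuple, so it is not sufficient by itself:

  -- Existing facility, shown only to contrast with the proposed strategy;
  -- my_picture is the example table used elsewhere in this proposal.
  ALTER TABLE my_picture ALTER COLUMN pictdata SET STORAGE EXTERNAL;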

== Snapshot ==

The largeobject interface uses SnapshotNow for writable accesses, and
GetActiveSnapshot() for read-only accesses, but toast_fetch_datum()
uses SnapshotToast to scan the toast relation.

It seems to me that SnapshotToast depends on the assumption that tuples
within a TOAST relation never have multiple versions.
When we update a toast value, the TOAST mechanism inserts the whole
variable-length datum under a new chunk_id, and the older chunks are
removed by toast_delete_datum().
The TOAST pointer is updated to the new chunk_id, and its visibility
is under MVCC control.

The source code comment at HeapTupleSatisfiesToast() says the following:

/*
 * HeapTupleSatisfiesToast
 *  True iff heap tuple is valid as a TOAST row.
 *
 * This is a 

Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-07-15 Thread KaiGai Kohei
Joshua,

I found your name as a reviewer at commitfest.postgresql.org.

However, I don't think the initial proposal of the largeobject
security is in a state to be reviewed seriously right now.

My preference is that it should be reworked based on the TOASTing
approach discussed previously in this thread, if we can find a
reasonable solution for the issue I raised before.

KaiGai Kohei wrote:
 I found one more issue that arises when we apply largeobject-style
 interfaces to generic toasted varlena data.

 When we fetch a toasted datum, it scans pg_toast_%u with SnapshotToast,
 because it assumes toasted chunks never have multiple versions, and that
 visibility of the toast pointer always implies visibility of the toast
 chunks.

 However, if we provide largeobject-style interfaces which allow partial
 updates of toasted varlena, it seems to me this assumption no longer
 holds.
 
 Is there any good idea?

The largeobject interface uses SnapshotNow for writable accesses, and
GetActiveSnapshot() for read-only accesses, but toast_fetch_datum()
uses SnapshotToast to scan the toast relation.

It seems to me that SnapshotToast depends on the assumption that tuples
within a TOAST relation never have multiple versions.
When we update a toast value, the TOAST mechanism inserts the whole
variable-length datum under a new chunk_id, and the older chunks are
removed by toast_delete_datum().
The TOAST pointer is updated to the new chunk_id, and its visibility
is under MVCC control.

The source code comment at HeapTupleSatisfiesToast() says the following:

/*
 * HeapTupleSatisfiesToast
 *  True iff heap tuple is valid as a TOAST row.
 *
 * This is a simplified version that only checks for VACUUM moving conditions.
 * It's appropriate for TOAST usage because TOAST really doesn't want to do
 * its own time qual checks; if you can see the main table row that contains
 * a TOAST reference, you should be able to see the TOASTed value.  However,
 * vacuuming a TOAST table is independent of the main table, and in case such
 * a vacuum fails partway through, we'd better do this much checking.
 *
 * Among other things, this means you can't do UPDATEs of rows in a TOAST
 * table.
 */

If a largeobject-like interface is available to update parts of TOAST
values, we cannot keep this assumption.

I would like to settle this issue before reworking the patch.

Is there any good idea?
If we follow the current largeobject manner, what problems can be found?

Thanks,

 KaiGai Kohei wrote:
 I concluded that the following issues need to be solved when we apply
 largeobject-like interfaces to big toasted data within general
 relations, not only the pg_largeobject system catalog.

 First, we need to add a new strategy that stores the given varlena data
 in the external toast relation.
 If we want to seek to and fetch a certain data chunk, it must be
 computable which chunk stores the data specified by offset and length,
 so the external chunks must always be uncompressed. This is a common
 requirement for both read and write operations.
 If we want to update a part of the toasted data chunks, the datum must
 not be stored inline regardless of its length, because in that case we
 would have to update the whole tuple containing the inlined data.
 If we open the toasted varlena in read-only mode, inline storage does
 not prevent anything; it is an issue only for write operations.

 I would like to add a new pg_type.typstorage strategy with the
 following characteristics:
  1. It always stores the given varlena data without any compression,
     so the data is stored as a set of fixed-length chunks.
  2. It always stores the given varlena data in the external toast relation.

 I suggest a new built-in type named BLOB which has an identical definition
 to the BYTEA type, except for its attstorage.

 Next, a different version of lo_open() should be provided that accepts
 the BLOB type, as follows:

   SELECT pictname, lo_open(pictdata, x'2'::int) FROM my_picture;

 It will allocate a largeobject descriptor for the given BLOB data, and
 the user can read and write it using the loread() and lowrite()
 interfaces.

 issue:
   In this case, should it hold a relation handle and locks on the
   my_picture relation, not only on its toast relation?
 issue:
   Should lo_open() in read-only mode also be available for the existing
   TEXT or BYTEA types? I could not find any reason to deny that.

 Next, the pg_largeobject system catalog can be redefined using the
 BLOB type as follows:

   CATALOG(pg_largeobject,2613)
   {
       Oid      loowner;   /* OID of the largeobject owner */
       Oid      lonsp;     /* OID of the largeobject namespace */
       aclitem  loacl[1];  /* access permissions */
       blob     lodata;    /* contents of the largeobject */
   } FormData_pg_largeobject;

 The existing largeobject interfaces operate on the pg_largeobject.lodata
 specified by the largeobject identifier.
 The rest of the metadata can be used for access 

Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-07-01 Thread KaiGai Kohei
I found one more issue that arises when we apply largeobject-style
interfaces to generic toasted varlena data.

When we fetch a toasted datum, it scans pg_toast_%u with SnapshotToast,
because it assumes toasted chunks never have multiple versions, and that
visibility of the toast pointer always implies visibility of the toast
chunks.

However, if we provide largeobject-style interfaces which allow partial
updates of toasted varlena, it seems to me this assumption no longer holds.

Is there any good idea?

KaiGai Kohei wrote:
 I concluded that the following issues need to be solved when we apply
 largeobject-like interfaces to big toasted data within general
 relations, not only the pg_largeobject system catalog.

 First, we need to add a new strategy that stores the given varlena data
 in the external toast relation.
 If we want to seek to and fetch a certain data chunk, it must be
 computable which chunk stores the data specified by offset and length,
 so the external chunks must always be uncompressed. This is a common
 requirement for both read and write operations.
 If we want to update a part of the toasted data chunks, the datum must
 not be stored inline regardless of its length, because in that case we
 would have to update the whole tuple containing the inlined data.
 If we open the toasted varlena in read-only mode, inline storage does
 not prevent anything; it is an issue only for write operations.
 
 I would like to add a new pg_type.typstorage strategy with the
 following characteristics:
  1. It always stores the given varlena data without any compression,
     so the data is stored as a set of fixed-length chunks.
  2. It always stores the given varlena data in the external toast relation.

 I suggest a new built-in type named BLOB which has an identical definition
 to the BYTEA type, except for its attstorage.
 
 Next, a different version of lo_open() should be provided that accepts
 the BLOB type, as follows:

   SELECT pictname, lo_open(pictdata, x'2'::int) FROM my_picture;

 It will allocate a largeobject descriptor for the given BLOB data, and
 the user can read and write it using the loread() and lowrite()
 interfaces.

 issue:
   In this case, should it hold a relation handle and locks on the
   my_picture relation, not only on its toast relation?
 issue:
   Should lo_open() in read-only mode also be available for the existing
   TEXT or BYTEA types? I could not find any reason to deny that.
 
 Next, the pg_largeobject system catalog can be redefined using the
 BLOB type as follows:

   CATALOG(pg_largeobject,2613)
   {
       Oid      loowner;   /* OID of the largeobject owner */
       Oid      lonsp;     /* OID of the largeobject namespace */
       aclitem  loacl[1];  /* access permissions */
       blob     lodata;    /* contents of the largeobject */
   } FormData_pg_largeobject;
 
 The existing largeobject interfaces operate on the pg_largeobject.lodata
 specified by the largeobject identifier.
 The rest of the metadata can be used for access control purposes.
 
 Thanks,
 
 KaiGai Kohei wrote:
 Tom Lane wrote:
 Bernd Helmle maili...@oopsware.de writes:
 It might be interesting to dig into your proposal deeper in conjunction 
 with TOAST (you've already mentioned this TODO). Having serial access with 
 a nice interface into TOAST would be eliminating the need for 
 pg_largeobject completely (i'm not a big fan of this one-big-system-table 
 approach the old LO interface currently is).
 Yeah, it would be more useful probably to fix that than to add
 decoration to the LO facility.  Making LO more usable is just going to
 encourage people to bump into its other limitations (32-bit OIDs,
 32-bit object size, finite maximum size of pg_largeobject, lack of
 dead-object cleanup, etc etc).
 The reason I mentioned the named largeobject feature is that DAC
 security checks on largeobjects require them to belong to a certain
 schema, so I thought it would be quite natural to give them a string
 name. However, it is obviously not a significant theme for me.

 I also agree with your opinion that the largeobject interfaces should
 be redefined to access parts of TOAST'ed varlena data, not only
 pg_largeobject.

 In this case, we will need a new pg_type.typstorage option which forces
 the given varlena data onto the external relation without compression,
 because we cannot compute the data offset within inlined or compressed
 external varlena data.

 I'll try to submit a design within a few days.
 Thanks,
 
 


-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-06-30 Thread KaiGai Kohei
I concluded that the following issues need to be solved when we apply
largeobject-like interfaces to big toasted data within general
relations, not only the pg_largeobject system catalog.

First, we need to add a new strategy that stores the given varlena data
in the external toast relation.
If we want to seek to and fetch a certain data chunk, it must be
computable which chunk stores the data specified by offset and length,
so the external chunks must always be uncompressed. This is a common
requirement for both read and write operations.
If we want to update a part of the toasted data chunks, the datum must
not be stored inline regardless of its length, because in that case we
would have to update the whole tuple containing the inlined data.
If we open the toasted varlena in read-only mode, inline storage does
not prevent anything; it is an issue only for write operations.
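
As an illustration of why uncompressed, fixed-length chunks matter (1996
bytes is roughly TOAST_MAX_CHUNK_SIZE on a default 8KB-page build, and the
OIDs are placeholders), fetching the chunks that cover offset 100000 for a
length of 8192 becomes a simple range scan:

  -- Placeholder OIDs: pg_toast_16390 for the TOAST relation, 24576 for
  -- the chunk_id.  Integer division maps a byte offset to a chunk_seq.
  SELECT chunk_seq, chunk_data
    FROM pg_toast.pg_toast_16390
   WHERE chunk_id = 24576
     AND chunk_seq BETWEEN (100000 / 1996)
                       AND ((100000 + 8192 - 1) / 1996)
   ORDER BY chunk_seq;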

I would like to add a new pg_type.typstorage strategy with the following
characteristics:
 1. It always stores the given varlena data without any compression,
    so the data is stored as a set of fixed-length chunks.
 2. It always stores the given varlena data in the external toast relation.

I suggest a new built-in type named BLOB which has an identical definition
to the BYTEA type, except for its attstorage.

Next, a different version of lo_open() should be provided that accepts
the BLOB type, as follows:

  SELECT pictname, lo_open(pictdata, x'2'::int) FROM my_picture;

It will allocate a largeobject descriptor for the given BLOB data, and
the user can read and write it using the loread() and lowrite() interfaces.

issue:
  In this case, should it hold a relation handle and locks on the
  my_picture relation, not only on its toast relation?
issue:
  Should lo_open() in read-only mode also be available for the existing
  TEXT or BYTEA types? I could not find any reason to deny that.

Next, the pg_largeobject system catalog can be redefined using the BLOB
type as follows:

  CATALOG(pg_largeobject,2613)
  {
      Oid      loowner;   /* OID of the largeobject owner */
      Oid      lonsp;     /* OID of the largeobject namespace */
      aclitem  loacl[1];  /* access permissions */
      blob     lodata;    /* contents of the largeobject */
  } FormData_pg_largeobject;

The existing largeobject interfaces operate on the pg_largeobject.lodata
specified by the largeobject identifier.
The rest of the metadata can be used for access control purposes.

Thanks,

KaiGai Kohei wrote:
 Tom Lane wrote:
 Bernd Helmle maili...@oopsware.de writes:
 It might be interesting to dig into your proposal deeper in conjunction 
 with TOAST (you've already mentioned this TODO). Having serial access with 
 a nice interface into TOAST would be eliminating the need for 
 pg_largeobject completely (i'm not a big fan of this one-big-system-table 
 approach the old LO interface currently is).
 Yeah, it would be more useful probably to fix that than to add
 decoration to the LO facility.  Making LO more usable is just going to
 encourage people to bump into its other limitations (32-bit OIDs,
 32-bit object size, finite maximum size of pg_largeobject, lack of
 dead-object cleanup, etc etc).
 
 The reason I mentioned the named largeobject feature is that DAC
 security checks on largeobjects require them to belong to a certain
 schema, so I thought it would be quite natural to give them a string
 name. However, it is obviously not a significant theme for me.

 I also agree with your opinion that the largeobject interfaces should
 be redefined to access parts of TOAST'ed varlena data, not only
 pg_largeobject.

 In this case, we will need a new pg_type.typstorage option which forces
 the given varlena data onto the external relation without compression,
 because we cannot compute the data offset within inlined or compressed
 external varlena data.
 
 I'll try to submit a design within a few days.
 Thanks,


-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-06-29 Thread Bernd Helmle



--On 29 June 2009 08:32:29 +0900 KaiGai Kohei kai...@ak.jp.nec.com wrote:


Yes, the intent is to assign an identifier string, not only a numeric
large object identifier. The identifier string can be qualified
with a certain namespace, as follows.

E.g)
 SELECT lo_open('my_picture01', x'4'::int);
 SELECT lo_create('pg_temp.my_musid02');

In the latter case, the new largeobject will be reclaimed after the
session closes, because it is in the temporary namespace.


I'm not sure about the usefulness of this. While having an identifier for a 
LO is nice, i believe most users store additional metadata about objects 
within their own tables anyways, linking the LO there. Also i doubt there 
is much need for temporary large objects (at least, i have no idea about 
this).


It might be interesting to dig into your proposal deeper in conjunction 
with TOAST (you've already mentioned this TODO). Having serial access with 
a nice interface into TOAST would be eliminating the need for 
pg_largeobject completely (i'm not a big fan of this one-big-system-table 
approach the old LO interface currently is).


--
Thanks

Bernd



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-06-29 Thread Tom Lane
Bernd Helmle maili...@oopsware.de writes:
 It might be interesting to dig into your proposal deeper in conjunction 
 with TOAST (you've already mentioned this TODO). Having serial access with 
 a nice interface into TOAST would be eliminating the need for 
 pg_largeobject completely (i'm not a big fan of this one-big-system-table 
 approach the old LO interface currently is).

Yeah, it would be more useful probably to fix that than to add
decoration to the LO facility.  Making LO more usable is just going to
encourage people to bump into its other limitations (32-bit OIDs,
32-bit object size, finite maximum size of pg_largeobject, lack of
dead-object cleanup, etc etc).

regards, tom lane



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-06-29 Thread KaiGai Kohei
Tom Lane wrote:
 Bernd Helmle maili...@oopsware.de writes:
 It might be interesting to dig into your proposal deeper in conjunction 
 with TOAST (you've already mentioned this TODO). Having serial access with 
 a nice interface into TOAST would be eliminating the need for 
 pg_largeobject completely (i'm not a big fan of this one-big-system-table 
 approach the old LO interface currently is).
 
 Yeah, it would be more useful probably to fix that than to add
 decoration to the LO facility.  Making LO more usable is just going to
 encourage people to bump into its other limitations (32-bit OIDs,
 32-bit object size, finite maximum size of pg_largeobject, lack of
 dead-object cleanup, etc etc).

The reason I mentioned the named largeobject feature is that DAC
security checks on largeobjects require them to belong to a certain
schema, so I thought it would be quite natural to give them a string
name. However, it is obviously not a significant theme for me.

I also agree with your opinion that the largeobject interfaces should
be redefined to access parts of TOAST'ed varlena data, not only
pg_largeobject.

In this case, we will need a new pg_type.typstorage option which forces
the given varlena data onto the external relation without compression,
because we cannot compute the data offset within inlined or compressed
external varlena data.

I'll try to submit a design within a few days.
Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-06-28 Thread KaiGai Kohei
Bernd Helmle wrote:
 --On 26 June 2009 13:08:37 +0900 KaiGai Kohei kai...@ak.jp.nec.com
 wrote:
 
  * Is the named large object (including a fully qualified one) worthwhile?
    It would enable us to specify a largeobject with a human-readable
    identifier string.
 
 I don't understand the notion of this. Does this mean you can create a 
 LO with an identifier string, generated from (e.g.) your application?

Yes, the intent is to assign an identifier string, not only a numeric
large object identifier. The identifier string can be qualified
with a certain namespace, as follows.

E.g)
 SELECT lo_open('my_picture01', x'4'::int);
 SELECT lo_create('pg_temp.my_musid02');

In the latter case, the new largeobject will be reclaimed after the
session closes, because it is in the temporary namespace.

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-06-26 Thread Bernd Helmle



--On 26 June 2009 13:08:37 +0900 KaiGai Kohei kai...@ak.jp.nec.com wrote:


 * Is the named large object (including a fully qualified one) worthwhile?
   It would enable us to specify a largeobject with a human-readable
   identifier string.


I don't understand the notion of this. Does this mean you can create a LO 
with an identifier string, generated from (e.g.) your application?


--
Thanks

Bernd



[HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-06-25 Thread KaiGai Kohei
This patch adds support for DAC security checks on largeobjects.

The enhanced GRANT/REVOKE statements allow setting SELECT (read) and
UPDATE (write) permissions on individual largeobjects.

At creation time, it checks ACL_CREATE on the schema object.
Currently, a largeobject does not have a human-readable name or a
qualified namespace, so we assume the public namespace here.

At deletion time, it checks ownership of the largeobject.
Only the owner and superusers can drop a largeobject.

The ownership and schema can be set using:
  ALTER LARGE OBJECT lobj OWNER TO role;
  ALTER LARGE OBJECT lobj SET SCHEMA schema;

The current pg_largeobject system catalog cannot store metadata of
largeobjects, so its data structure is modified as follows:

  CATALOG(pg_largeobject,2613)
  {
  Oid   lonsp;  /* OID of the namespace */
  Oid   loowner;/* OID of the owner */
  aclitem   loacl[1];   /* access permissions */
  } FormData_pg_largeobject;

The current pg_largeobject is renamed to pg_largeobject_data.
One or more tuples within pg_largeobject_data point to a record
within pg_largeobject, which holds the metadata of a certain
largeobject.

  CATALOG(pg_largeobject_data,2966) BKI_WITHOUT_OIDS
  {
  Oid   loid;   /* Identifier of large object */
  int4  pageno; /* Page number (starting from 0) */
  bytea data;   /* Data for page (may be zero-length) */
  } FormData_pg_largeobject_data;
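
Under this split, a lookup conceptually joins the metadata row to its
pages. Purely as an illustration against the proposed catalogs (not the
current ones):

  -- 1234 is an arbitrary example largeobject OID.
  SELECT m.oid AS loid, m.loowner, m.lonsp, d.pageno, octet_length(d.data)
    FROM pg_largeobject m
    JOIN pg_largeobject_data d ON d.loid = m.oid
   WHERE m.oid = 1234
   ORDER BY d.pageno;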

Issues:
 * Is the ALTER LARGE OBJECT interface suitable?
 * How should we handle the namespace (schema) and the ownership
   of the largeobject?
 * Is the named large object (including a fully qualified one) worthwhile?
   It would enable us to specify a largeobject with a human-readable
   identifier string.
 * Is the data structure appropriate?
   - As an aside, pg_largeobject_data has an identical definition to
     TOAST tables. It may be possible to store the data within a TOAST
     table.
 * If so, it may also resolve another TODO item:
   - Allow read/write into TOAST values like large objects

Memo:
  http://wiki.postgresql.org/wiki/Largeobject_Enhancement

Example:
  postgres=# REVOKE ALL ON LARGE OBJECT 1234 FROM PUBLIC;
  REVOKE
  postgres=# GRANT SELECT ON LARGE OBJECT 1234 TO ymj;
  GRANT
  postgres=# GRANT SELECT,UPDATE ON LARGE OBJECT 1234 TO tak;
  GRANT

  postgres=# \c - ymj
  psql (8.4rc2)
  You are now connected to database postgres as user ymj.
  postgres=> SELECT loread(lo_open(1234, x'4'::int), 10);
     loread
  ------------
   1234567890
  (1 row)

  postgres=> SELECT lowrite(lo_open(1234, x'2'::int), 'abcdefg');
  ERROR:  permission denied for largeobject largeobject:1234

  postgres=> \c - tak
  psql (8.4rc2)
  You are now connected to database postgres as user tak.
  postgres=> SELECT lowrite(lo_open(1234, x'2'::int), 'abcdefg');
   lowrite
  ---------
         7
  (1 row)


Scale of the patch:
  [kai...@saba ]$ diffstat pgsql-lobj-perms-8.4rc2-r2080.patch
   backend/catalog/Makefile   |6
   backend/catalog/aclchk.c   |  243 ++
   backend/catalog/dependency.c   |   15 +
   backend/catalog/pg_largeobject.c   |  265 

   backend/commands/alter.c   |9 +
   backend/libpq/be-fsstubs.c |   25 +++
   backend/parser/gram.y  |   28 +++
   backend/storage/large_object/inv_api.c |  140 +++--!!
   backend/tcop/utility.c |6
   backend/utils/adt/acl.c|4
   backend/utils/cache/syscache.c |   13 +
   include/catalog/dependency.h   |1
   include/catalog/indexing.h |7
   include/catalog/pg_largeobject.h   |   21 !!
   include/catalog/pg_largeobject_data.h  |   54 ++
   include/catalog/toasting.h |1
   include/nodes/parsenodes.h |1
   include/utils/acl.h|7
   include/utils/syscache.h   |1
   test/regress/expected/sanity_check.out |3
   test/regress/input/largeobject.source  |   95 +++
   test/regress/output/largeobject.source |  175 +
   22 files changed, 803 insertions(+), 46 deletions(-), 271 modifications(!)

-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com


pgsql-lobj-perms-8.4rc2-r2080.patch.gz
Description: application/gzip
