This is an automated email from the ASF dual-hosted git repository. adelapena pushed a commit to branch cassandra-5.0 in repository https://gitbox.apache.org/repos/asf/cassandra.git
The following commit(s) were added to refs/heads/cassandra-5.0 by this push:
     new 5817799605 Restore accidentally deleted DDM doc
5817799605 is described below

commit 58177996058a57e4909b11bcc6e754c8a6e38f6d
Author: Andrés de la Peña <a.penya.gar...@gmail.com>
AuthorDate: Fri Aug 18 11:01:45 2023 +0100

    Restore accidentally deleted DDM doc

    patch by Andrés de la Peña; reviewed by Brandon Williams and Berenguer Blasi for CASSANDRA-18776
---
 .../pages/developing/cql/dynamic_data_masking.adoc | 178 +++++++++++++++++++++
 .../cassandra/pages/developing/cql/index.adoc      |   2 +-
 2 files changed, 179 insertions(+), 1 deletion(-)

diff --git a/doc/modules/cassandra/pages/developing/cql/dynamic_data_masking.adoc b/doc/modules/cassandra/pages/developing/cql/dynamic_data_masking.adoc
new file mode 100644
index 0000000000..8f523ac2ed
--- /dev/null
+++ b/doc/modules/cassandra/pages/developing/cql/dynamic_data_masking.adoc
@@ -0,0 +1,178 @@
+= Dynamic Data Masking
+
+Dynamic data masking (DDM) makes it possible to obscure sensitive information while still allowing access to the masked columns.
+DDM doesn't change the stored data. Instead, it just presents the data in its obscured form during `SELECT` queries.
+This aims to provide some degree of protection against accidental data exposure. However, it's important to know that
+anyone with direct access to the sstable files will be able to read the clear data.
+
+== Masking functions
+
+DDM is based on a set of CQL native functions that obscure sensitive information. The available functions are:
+
+include::partial$masking_functions.adoc[]
+
+These functions can be called explicitly in `SELECT` queries to get an obscured view of the data. For example:
+
+[source,cql]
+----
+include::example$CQL/select_with_mask_functions.cql[]
+----
+
+== Attaching masking functions to table columns
+
+The masking functions can be permanently attached to the columns of a table.
+In that case, `SELECT` queries will always return the column values in their masked form.
+The masking will be transparent for the users running `SELECT` queries,
+so their only way to know that a column is masked will be consulting the table definition.
+
+This is an optional feature that must be enabled with the `dynamic_data_masking_enabled` property in `cassandra.yaml`,
+since it's disabled by default.
+
+The masks of the columns of a table can be defined on `CREATE TABLE` queries:
+
+[source,cql]
+----
+include::example$CQL/ddm_create_table.cql[]
+----
+
+Note that in the example above we are referencing the `mask_inner` function with two arguments.
+However, that CQL function actually has three arguments when explicitly used on `SELECT` queries.
+The first argument is always omitted when attaching the function to a schema column.
+The value of that first argument is always interpreted as the value of the masked column, in this case a `text` column.
+For the same reason the call to `mask_default` attached to the column doesn't have any argument,
+even though that function requires one argument when explicitly used on `SELECT` queries.
+
+Data can be inserted into the masked table as usual. For example:
+
+[source,cql]
+----
+include::example$CQL/ddm_insert_data.cql[]
+----
+
+The attached column masks will make `SELECT` queries automatically return masked data,
+without the need to include the masking function in the query:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_with_masked_columns.cql[]
+----
+
+The masking function attached to a column can be changed with an `ALTER TABLE` query:
+
+[source,cql]
+----
+include::example$CQL/ddm_alter_mask.cql[]
+----
+
+In a similar way, a masking function can be detached from a column with an `ALTER TABLE` query:
+
+[source,cql]
+----
+include::example$CQL/ddm_drop_mask.cql[]
+----
+
+== Permissions
+
+The `UNMASK` permission allows users to retrieve the unmasked values of masked columns.
+The masks will only be applied to the results of a `SELECT` query if the user doesn't have the `UNMASK` permission.
+Ordinary users are created without the `UNMASK` permission, whereas superusers do have it.
+
+As an example, suppose that we have a table with masked columns:
+
+[source,cql]
+----
+include::example$CQL/ddm_create_table.cql[]
+----
+
+And we insert some data into the table:
+
+[source,cql]
+----
+include::example$CQL/ddm_insert_data.cql[]
+----
+
+[source,cql]
+----
+include::example$CQL/ddm_select_without_unmask_permission.cql[]
+----
+
+Then we create two users with `SELECT` permission for the table, but we only grant the `UNMASK` permission to one of
+the users:
+
+[source,cql]
+----
+include::example$CQL/ddm_create_users.cql[]
+----
+
+We can now see that the user with the `UNMASK` permission can see the clear data, without any masking:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_with_unmask_permission.cql[]
+----
+
+However, the user without the `UNMASK` permission can only see the masked data:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_without_unmask_permission.cql[]
+----
+
+The `UNMASK` permission works like any other permission. Thus, it can be revoked at any time:
+
+[source,cql]
+----
+include::example$CQL/ddm_revoke_unmask.cql[]
+----
+
+Please note that the anonymous user that is used when authentication is disabled has all the permissions.
+Since these include the `UNMASK` permission, that anonymous user will always see the clear data.
+In other words, attaching data masking functions to columns only makes sense if authentication is enabled.
+
+Users without the `UNMASK` permission are not allowed to use masked columns in the `WHERE` clause of a `SELECT` query.
+This prevents malicious users from figuring out the clear data by running exhaustive queries. For instance:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_without_select_masked.cql[]
+----
+
+However, there are some use cases where trusted database users just need a useful way to produce masked data
+that will be served to untrusted external users.
+For example, a trusted app can connect to the database and extract masked data that will be served to its end users.
+In that case the trusted user (the app) can be given the `SELECT_MASKED` permission.
+That permission allows using masked columns in the `WHERE` clause of a `SELECT` query,
+while still seeing the masked data in the query results. For instance:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_with_select_masked.cql[]
+----
+
+== Custom functions
+
+xref:developing/cql/functions.adoc#user-defined-scalar-functions[User-defined functions (UDFs)] can be attached to a table column.
+The UDFs used for masking should belong to the same keyspace as the masked table.
+The column value to mask will be passed as the first argument of the attached UDF.
+Thus, the UDFs attached to a column should have at least one argument,
+and that argument should have the same type as the masked column.
+Also, the attached UDF should return values of the same type as the masked column. For instance:
+
+[source,cql]
+----
+include::example$CQL/ddm_create_table_with_udf.cql[]
+----
+
+This creates a dependency between the table schema and the functions.
+Any attempt to drop the function will be rejected while this dependency exists.
+Thus, to drop the function you should first drop the mask.
+This can be done with:
+
+[source,cql]
+----
+include::example$CQL/ddm_drop_mask.cql[]
+----
+
+Dropping the column, or its containing table, or its containing keyspace would also remove the dependency.
+
+xref:developing/cql/functions.adoc#aggregate-functions[Aggregate functions] cannot be used as masking functions.
diff --git a/doc/modules/cassandra/pages/developing/cql/index.adoc b/doc/modules/cassandra/pages/developing/cql/index.adoc
index 84e84a9a21..4c7014269b 100644
--- a/doc/modules/cassandra/pages/developing/cql/index.adoc
+++ b/doc/modules/cassandra/pages/developing/cql/index.adoc
@@ -19,7 +19,7 @@ For that reason, when used in this document, these terms (tables, rows and colum
 * xref:cql/functions.adoc[Functions]
 * xref:cql/json.adoc[JSON]
 * xref:cql/security.adoc[CQL security]
-* xref:cql/dynamic_data_masking.adoc[Dynamic data masking]
+* xref:developing/cql/dynamic_data_masking.adoc[Dynamic data masking]
 * xref:cql/triggers.adoc[Triggers]
 * xref:cql/appendices.adoc[Appendices]
 * xref:cql/changes.adoc[Changes]
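
For readers of this commit, the argument rule described in the restored page can be sketched in CQL: a mask attached to a column takes one argument fewer than the same function called explicitly in a `SELECT`, because the column value is implied as the first argument. This is a minimal sketch written from the page's description, not a copy of the `include::` example files, whose contents are not part of this email. The table names `patients_plain` and `patients_masked` and their columns are hypothetical, the `MASKED WITH` clause is assumed to be the Cassandra 5.0 syntax for attaching a mask, and the sketch assumes `dynamic_data_masking_enabled: true` in `cassandra.yaml` and a keyspace already selected with `USE`.

```cql
-- Hypothetical table without attached masks.
CREATE TABLE patients_plain (
   id timeuuid PRIMARY KEY,
   name text,
   birth date
);

-- Explicit use: the column is passed as the first argument,
-- so mask_inner takes three arguments and mask_default takes one.
SELECT mask_inner(name, 1, null), mask_default(birth) FROM patients_plain;

-- Hypothetical table with attached masks (assumed MASKED WITH syntax).
-- The first argument is omitted: mask_inner is written with two
-- arguments and mask_default with none.
CREATE TABLE patients_masked (
   id timeuuid PRIMARY KEY,
   name text MASKED WITH mask_inner(1, null),
   birth date MASKED WITH mask_default()
);

-- No masking function appears in the query; users without the
-- UNMASK permission receive the masked values.
SELECT name, birth FROM patients_masked;
```

As the Permissions section of the page states, a user holding `UNMASK` would see the clear values from the last query, and the anonymous user used when authentication is disabled always does.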