[ 
https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384516#comment-16384516
 ] 

Joseph Lynch edited comment on CASSANDRA-12151 at 3/3/18 6:50 AM:
------------------------------------------------------------------

There are a lot of competing desires in this ticket, and I want to firmly +1 an 
incremental approach where we get the basic interface in and then add new 
implementations or configuration options in follow-up items. If I understand 
the comments above, we're trying to solve all of the following in one ticket:
 # Logging for security
 # Logging for business accounting compliance (e.g. SOX)
 # Logging for monetary transaction compliance (e.g. PCI)
 # Logging for replay later (e.g. for correctness testing)
 # Logging for debugging

These require different implementations because they make different 
tradeoffs, and they may not all get done in this ticket. For example, 
CASSANDRA-13983 is focused on use case #4 and appears to use a highly optimized 
binary format, which has lower overhead but does not meet the requirements for 
#1/2/3 and requires custom parsers (you can't just hand the file to the 
auditors), whereas the patch [~vinaykumarcse] has provided aims, I believe, 
more at #1/2/3. Generally #2/3 require a guarantee that a query is logged if it 
was attempted, regardless of whether it succeeded, and #2/3 generally require a 
lot more context than #1/4 do. I think it would be great if in this ticket we 
can commit the basic interface and a starting point for the configuration 
options (e.g. whether it is controlled through cassandra.yaml or table 
options), and then work on improving performance, configuration flexibility, 
etc. in follow-up tickets.

[~jasobrown] wrote:
{quote}I agree with [~djoshi3] and [~jjordan]: this functionality should really 
leverage the existing behavior of FQL (CASSANDRA-13893). There is no need to 
create a parallel or duplicate set of behaviors, unless it's completely 
warranted - and I have heard no arguments here that it is.
{quote}
Do you think the patch creates a parallel or duplicate set of behaviors? It 
provides a different query logging implementation that makes different 
tradeoffs and targets a different use case, but I think everyone agrees that we 
can unify the two behind one interface. We just have to make sure that 
interface carries enough query context for all use cases, which might be tough 
because FQL doesn't really need session context such as user info, but #1/2/3 
do.
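
To make the "one interface, several implementations" idea concrete, here is a 
rough sketch in Python rather than Cassandra's Java. Every name in it 
(QueryEvent, QueryLogger, ReplayLogger, AuditLogger, dispatch) is hypothetical 
and is not taken from the patch or from FQL. It only illustrates that a single 
event type can carry the session context #1/2/3 need while a replay-style 
logger simply ignores it.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol, Tuple


@dataclass(frozen=True)
class QueryEvent:
    """Superset of the context any logger might need (hypothetical fields)."""
    query: str
    timestamp_ms: int
    succeeded: bool                       # audit use cases log failures too
    user: Optional[str] = None            # session context needed by #1/2/3
    remote_address: Optional[str] = None


class QueryLogger(Protocol):
    """The single interface both implementations sit behind."""
    def log(self, event: QueryEvent) -> None: ...


class ReplayLogger:
    """Stand-in for a replay-oriented logger: keeps only what replay needs."""
    def __init__(self) -> None:
        self.records: List[Tuple[int, str]] = []

    def log(self, event: QueryEvent) -> None:
        self.records.append((event.timestamp_ms, event.query))


class AuditLogger:
    """Stand-in for an audit logger: readable lines with session context."""
    def __init__(self) -> None:
        self.lines: List[str] = []

    def log(self, event: QueryEvent) -> None:
        status = "OK" if event.succeeded else "FAILED"
        self.lines.append(
            f"{event.timestamp_ms}|{event.user}|{event.remote_address}"
            f"|{status}|{event.query}"
        )


def dispatch(event: QueryEvent, loggers: List[QueryLogger]) -> None:
    """Hand the same event to every configured logger."""
    for logger in loggers:
        logger.log(event)
```

The design question in this sketch is only how wide QueryEvent has to be; the 
implementations are free to drop whatever they don't need.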

[~eanujwa] wrote:
{quote}If you are logging the exact query with all the values in case of 
regular queries (not prepared), then how would logging bind values of a 
prepared statement becomes a security concern?
{quote}
Generally speaking, secure applications use prepared statements exclusively, 
because simple statements are vulnerable to injection. Also, if you're using 
audit logging for PCI (or even SOX), the data in DML could easily be sensitive 
(e.g. credit card numbers or users' names), and you probably want to avoid 
logging it by default. It could certainly be an option though.
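
As a sketch of what "off by default, but available as an option" could look 
like (Python for brevity; format_prepared_for_audit and its include_values 
flag are made-up names, not anything in the patch):

```python
from typing import Sequence


def format_prepared_for_audit(
    query: str,
    bind_values: Sequence[object],
    include_values: bool = False,
) -> str:
    """Render a prepared statement for an audit line.

    By default only the statement text and the number of bind values are
    written, so sensitive data never reaches the log. Passing
    include_values=True is an explicit opt-in.
    """
    if include_values:
        rendered = ", ".join(repr(v) for v in bind_values)
        return f"{query} [values: {rendered}]"
    return f"{query} [{len(bind_values)} bind values redacted]"
```

With the default, an INSERT carrying a card number would show up in the log as 
the statement text with "?" placeholders plus a redaction note.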

[~spo...@gmail.com] wrote:
{quote}Usually you'll see two kind of users on production systems: privileged 
users and application users. Auditing privileged users (admins or developers) 
will almost always make sense, in order to be able to detect unauthorized 
access and data manipulation. There's only a limited amount of statements to 
log, as these will be executed manually. It also shouldn't matter which 
keyspaces or tables are access by the users; he is either monitored or not.
{quote}
Doesn't the category filter adequately achieve this (you could exclude DML or 
QUERY)? Do we need per-user query logging when there are already per-user 
permissions limiting their access to the database in the first place?
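
For illustration, a category filter along these lines could be as small as the 
following Python sketch. CategoryFilter and should_log are hypothetical names, 
and only DML and QUERY are categories mentioned in this discussion; any other 
category string is an assumption.

```python
from typing import Iterable, Set


class CategoryFilter:
    """Decide whether a statement category should be audit logged."""

    def __init__(self, excluded: Iterable[str] = ()) -> None:
        # Normalize to upper case so configuration is case-insensitive.
        self.excluded: Set[str] = {c.upper() for c in excluded}

    def should_log(self, category: str) -> bool:
        return category.upper() not in self.excluded
```

An operator who only cares about manually executed administrative statements 
would construct it as CategoryFilter(["DML", "QUERY"]) and pay no logging cost 
for application traffic.
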
{quote}However, auditing queries of application users has a very limited 
security and data privacy benefit, but adds a great deal of load to the 
database. Those queries will be automatically generated by the application and 
there will be no way to tell if the query or statement was authorized, as you 
don't know on behalf of whom it was executed. Any auditing functionality for 
these operations must therefor take place at application level.
{quote}
While I agree use case #1 (security) does not require this, use cases #2 and #3 
very much do. For #2 or similar you typically have to prove that only 
authorized applications manipulated the database, and a typical way to do that 
is to produce query logs showing that only trusted application IP addresses and 
specific credentials issued DML statements (QUERY is less important here). For 
#3 the requirements are even greater, e.g. you may have to be able to prove 
that user data was not exfiltrated at all, which requires auditing QUERY 
statements. Yes, it's higher overhead, but if you can turn it off with the 
category filters I think it's fine, don't you?
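
To illustrate the kind of check an auditor would run for #2, here is a minimal 
Python sketch. The AuditEntry fields and the untrusted_dml helper are 
assumptions for the example, not the log format of the patch.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class AuditEntry:
    """One parsed audit log line (hypothetical layout)."""
    category: str
    user: str
    remote_address: str
    query: str


def untrusted_dml(
    entries: Iterable[AuditEntry],
    trusted: Set[Tuple[str, str]],
) -> List[AuditEntry]:
    """Return DML entries whose (user, address) pair is not on the allowlist.

    An empty result is the evidence that only trusted applications and
    credentials modified data during the audited period.
    """
    return [
        e for e in entries
        if e.category == "DML" and (e.user, e.remote_address) not in trusted
    ]
```

This only works if application DML is logged in the first place, which is why 
the category has to be available even though it costs more.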



> Audit logging for database activity
> -----------------------------------
>
>                 Key: CASSANDRA-12151
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12151
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: stefan setyadi
>            Assignee: Vinay Chella
>            Priority: Major
>             Fix For: 4.x
>
>         Attachments: 12151.txt, 
> DesignProposal_AuditingFeature_ApacheCassandra_v1.docx
>
>
> We would like a way to enable Cassandra to log database activity being done 
> on our server.
> It should show username, remote address, timestamp, action type, keyspace, 
> column family, and the query statement.
> It should also be able to log connection attempts and changes to the 
> users/roles.
> I was thinking of making a new keyspace and inserting an entry for every 
> activity that occurs.
> Then it would be possible to query for a specific activity or for queries 
> targeting a specific keyspace and column family.


