[ https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384516#comment-16384516 ]
Joseph Lynch edited comment on CASSANDRA-12151 at 3/3/18 6:50 AM:
------------------------------------------------------------------

There are a lot of competing desires in this ticket, and I want to firmly +1 an incremental approach where we get the basic interface in and start adding new implementations or configuration options in follow-up items. If I understand the comments above, we're trying to solve the following all in one ticket:
# Logging for security
# Logging for business accounting compliance (e.g. SOX)
# Logging for monetary transaction compliance (e.g. PCI)
# Logging for replay later (e.g. for correctness testing)
# Logging for debugging

These require different implementations because they require different tradeoffs, and they may not all get done in this ticket. For example, CASSANDRA-13983 is focused on use case #4 and appears to use a highly optimized binary format, which has lower overhead but does not meet the requirements for #1/2/3 and requires custom parsers (you can't just hand the file to the auditors), whereas the patch [~vinaykumarcse] has provided aims, I believe, more at #1/2/3. Generally #2/3 require a guarantee that a query is logged if it was attempted, regardless of whether it succeeded, and #2/3 generally require a lot more context than #1/4 do.

I think it would be great if in this ticket we can commit the basic interface and the starting points of the configuration options (e.g. is it controlled through cassandra.yaml or through table options), and we can work on improving performance, configuration flexibility, etc. in follow-up tickets.

[~jasobrown] wrote:
{quote}I agree with [~djoshi3] and [~jjordan]: this functionality should really leverage the existing behavior of FQL (CASSANDRA-13893). There is no need to create a parallel or duplicate set of behaviors, unless it's completely warranted - and I have heard no arguments here that it is.
{quote}
Do you think the patch creates a parallel or duplicate set of behaviors? It provides a different query logging implementation that makes different tradeoffs and targets a different use case, but I think everyone agrees that we can unify the two behind one interface. We just have to make sure that the interface carries enough query context for all use cases, which might be tough: FQL really doesn't need session context such as user info, but #1/2/3 do.

[~eanujwa] wrote:
{quote}If you are logging the exact query with all the values in case of regular queries (not prepared), then how would logging bind values of a prepared statement becomes a security concern?
{quote}
Generally speaking, secure applications exclusively use prepared statements, as simple statements are vulnerable to injection. Also, if you're using audit logging for PCI (or even SOX), the data in DML could easily be sensitive (e.g. credit card numbers or users' names), so you probably want to avoid logging it by default. It could certainly be an option though.

[~spo...@gmail.com] wrote:
{quote}Usually you'll see two kind of users on production systems: privileged users and application users. Auditing privileged users (admins or developers) will almost always make sense, in order to be able to detect unauthorized access and data manipulation. There's only a limited amount of statements to log, as these will be executed manually. It also shouldn't matter which keyspaces or tables are access by the users; he is either monitored or not.
{quote}
Doesn't the category filter adequately achieve this (you could exclude DML or QUERY)? Do we need per-user query logging when there are already per-user permissions limiting access to the database in the first place?
{quote}However, auditing queries of application users has a very limited security and data privacy benefit, but adds a great deal of load to the database. Those queries will be automatically generated by the application and there will be no way to tell if the query or statement was authorized, as you don't know on behalf of whom it was executed. Any auditing functionality for these operations must therefor take place at application level.
{quote}
While I agree use case #1 (security) does not require this, use cases #2 and #3 very much do. For #2 or similar, you typically have to prove that only authorized applications manipulated the database, and a typical way to do that is to produce query logs showing that only trusted application IP addresses and specific credentials made DML statements (QUERY is less important there). For #3 the requirements are even greater, e.g. you may have to be able to prove that user data was not exfiltrated at all, which requires auditing QUERY statements. Yes, it's higher overhead, but if you can turn it off with the category filters I think it's fine, don't you?


was (Author: jolynch):
There are a lot of competing desires in this ticket, and I want to firmly +1 an incremental approach where we get the basic interface in and start adding new implementations or configuration options in follow-up items. If I understand the comments above, we're trying to solve the following all in one ticket:
# Logging for security
# Logging for business accounting compliance (e.g. SOX)
# Logging for monetary transaction compliance (e.g. PCI)
# Logging for replay later (e.g. for correctness testing)

These require different implementations because they require different tradeoffs, and they may not all get done in this ticket. For example, CASSANDRA-13983 is focused on use case #4 and appears to use a highly optimized binary format, which has lower overhead but does not meet the requirements for #1/2/3 and requires custom parsers (you can't just hand the file to the auditors), whereas the patch [~vinaykumarcse] has provided aims, I believe, more at #1/2/3.
Generally #2/3 require a guarantee that a query is logged if it was attempted, regardless of whether it succeeded, and #2/3 generally require a lot more context than #1/4 do.

I think it would be great if in this ticket we can commit the basic interface and the starting points of the configuration options (e.g. is it controlled through cassandra.yaml or through table options), and we can work on improving performance, configuration flexibility, etc. in follow-up tickets.

[~jasobrown] wrote:
{quote}I agree with [~djoshi3] and [~jjordan]: this functionality should really leverage the existing behavior of FQL (CASSANDRA-13893). There is no need to create a parallel or duplicate set of behaviors, unless it's completely warranted - and I have heard no arguments here that it is.
{quote}
Do you think the patch creates a parallel or duplicate set of behaviors? It provides a different query logging implementation that makes different tradeoffs and targets a different use case, but I think everyone agrees that we can unify the two behind one interface. We just have to make sure that the interface carries enough query context for all use cases, which might be tough: FQL really doesn't need session context such as user info, but #1/2/3 do.

[~eanujwa] wrote:
{quote}If you are logging the exact query with all the values in case of regular queries (not prepared), then how would logging bind values of a prepared statement becomes a security concern?
{quote}
Generally speaking, secure applications exclusively use prepared statements, as simple statements are vulnerable to injection. Also, if you're using audit logging for PCI (or even SOX), the data in DML could easily be sensitive (e.g. credit card numbers or users' names), so you probably want to avoid logging it by default. It could certainly be an option though.

[~spo...@gmail.com] wrote:
{quote}Usually you'll see two kind of users on production systems: privileged users and application users. Auditing privileged users (admins or developers) will almost always make sense, in order to be able to detect unauthorized access and data manipulation. There's only a limited amount of statements to log, as these will be executed manually. It also shouldn't matter which keyspaces or tables are access by the users; he is either monitored or not.
{quote}
Doesn't the category filter adequately achieve this (you could exclude DML or QUERY)? Do we need per-user query logging when there are already per-user permissions limiting access to the database in the first place?
{quote}However, auditing queries of application users has a very limited security and data privacy benefit, but adds a great deal of load to the database. Those queries will be automatically generated by the application and there will be no way to tell if the query or statement was authorized, as you don't know on behalf of whom it was executed. Any auditing functionality for these operations must therefor take place at application level.
{quote}
While I agree use case #1 (security) does not require this, use cases #2 and #3 very much do. For #2 or similar, you typically have to prove that only authorized applications manipulated the database, and a typical way to do that is to produce query logs showing that only trusted application IP addresses and specific credentials made DML statements (QUERY is less important there). For #3 the requirements are even greater, e.g. you may have to be able to prove that user data was not exfiltrated at all, which requires auditing QUERY statements. Yes, it's higher overhead, but if you can turn it off with the category filters I think it's fine, don't you?

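
To make the "one interface, several implementations" idea in the comment concrete, here is a minimal sketch in Python. It is only an illustration of the shape being discussed, not the patch and not Cassandra's API: every name in it (AuditEntry, AuditSink, TextAuditSink, ReplaySink, record) is hypothetical. It assumes a single entry type that carries the session context use cases #1/2/3 need, a text sink with a category filter, and a replay-style sink that ignores that context.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical record; fields loosely follow the ticket description."""
    category: str                      # e.g. "QUERY", "DML", "DDL", "AUTH"
    statement: str                     # statement text only, no bind values
    succeeded: bool                    # attempts are recorded even on failure
    user: Optional[str] = None         # session context needed by #1/2/3
    remote_address: Optional[str] = None
    keyspace: Optional[str] = None


class AuditSink(Protocol):
    """The shared interface: each implementation decides what it keeps."""
    def log(self, entry: AuditEntry) -> None: ...


class TextAuditSink:
    """Human-readable sink (use cases #1/2/3) with a category filter."""

    def __init__(self, excluded_categories=()):
        self.excluded = frozenset(excluded_categories)
        self.lines = []

    def log(self, entry: AuditEntry) -> None:
        if entry.category in self.excluded:
            return  # filtered out, e.g. QUERY when read auditing is off
        status = "OK" if entry.succeeded else "FAILED"
        self.lines.append(
            f"user={entry.user} source={entry.remote_address} "
            f"category={entry.category} status={status} "
            f"keyspace={entry.keyspace} statement={entry.statement}"
        )


class ReplaySink:
    """Replay-oriented sink (use case #4); ignores session context."""

    def __init__(self):
        self.statements = []

    def log(self, entry: AuditEntry) -> None:
        self.statements.append(entry.statement)


def record(sinks, entry: AuditEntry) -> None:
    """Hand one entry to every configured sink."""
    for sink in sinks:
        sink.log(entry)


# Usage: audit everything except reads, replay everything.
audit = TextAuditSink(excluded_categories={"QUERY"})
replay = ReplaySink()
entries = [
    AuditEntry("QUERY", "SELECT * FROM ks.t WHERE id = ?", True,
               user="app", remote_address="10.0.0.5", keyspace="ks"),
    AuditEntry("DML", "INSERT INTO ks.t (id, v) VALUES (?, ?)", False,
               user="app", remote_address="10.0.0.5", keyspace="ks"),
]
for e in entries:
    record([audit, replay], e)
```

The failed INSERT still reaches the text sink, which is the "log the attempt regardless of outcome" property #2/3 depend on, while the SELECT is dropped by the category filter. The replay sink never reads the user or address fields, which is the mismatch in required context the comment points out.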
> Audit logging for database activity
> -----------------------------------
>
>                 Key: CASSANDRA-12151
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12151
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: stefan setyadi
>            Assignee: Vinay Chella
>            Priority: Major
>             Fix For: 4.x
>
>         Attachments: 12151.txt,
> DesignProposal_AuditingFeature_ApacheCassandra_v1.docx
>
>
> we would like a way to enable cassandra to log database activity being done
> on our server.
> It should show username, remote address, timestamp, action type, keyspace,
> column family, and the query statement.
> it should also be able to log connection attempt and changes to the
> user/roles.
> I was thinking of making a new keyspace and insert an entry for every
> activity that occurs.
> Then It would be possible to query for specific activity or a query targeting
> a specific keyspace and column family.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)