Author: zznate
Date: Mon Oct 29 22:53:37 2018
New Revision: 1845181

URL: http://svn.apache.org/viewvc?rev=1845181&view=rev
Log:
CASSANDRA-14835 - Audit Logging in 4.0 blog post from Vinay Chella

Added:
    cassandra/site/publish/blog/2018/10/29/
    cassandra/site/publish/blog/2018/10/29/audit_logging_cassandra.html
    cassandra/site/src/_posts/2018-10-29-audit_logging_cassandra.markdown
Modified:
    cassandra/site/publish/blog/index.html
    cassandra/site/publish/feed.xml

Added: cassandra/site/publish/blog/2018/10/29/audit_logging_cassandra.html
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/blog/2018/10/29/audit_logging_cassandra.html?rev=1845181&view=auto
==============================================================================
--- cassandra/site/publish/blog/2018/10/29/audit_logging_cassandra.html (added)
+++ cassandra/site/publish/blog/2018/10/29/audit_logging_cassandra.html Mon Oct 
29 22:53:37 2018
@@ -0,0 +1,358 @@
+<!DOCTYPE html>
+<html>
+  
+
+
+
+<head>
+  <meta charset="utf-8">
+  <meta http-equiv="X-UA-Compatible" content="IE=edge">
+  <meta name="viewport" content="width=device-width, initial-scale=1">
+  <meta name="description" content="Database audit logging is an industry 
standard tool for enterprises to capture critical data change events including 
what data changed and who triggered the ev...">
+  <meta name="keywords" content="cassandra, apache, apache cassandra, 
distributed storage, key value store, scalability, bigtable, dynamo" />
+  <meta name="robots" content="index,follow" />
+  <meta name="language" content="en" />  
+
+  <title>Audit Logging in Apache Cassandra 4.0</title>
+
+  <link rel="canonical" 
href="http://cassandra.apache.org/blog/2018/10/29/audit_logging_cassandra.html">
+
+  <link rel="stylesheet" 
href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" 
integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7"
 crossorigin="anonymous">
+  <link rel="stylesheet" href="./../../../../css/style.css">
+  
+
+  
+  <link rel="stylesheet" 
href="https://use.fontawesome.com/releases/v5.2.0/css/all.css" 
integrity="sha384-hWVjflwFxL6sNzntih27bfxkr27PmbbK/iSvJ+a4+0owXq79v+lsFkW54bOGbiDQ"
 crossorigin="anonymous">
+  
+  <link type="application/atom+xml" rel="alternate" 
href="http://cassandra.apache.org/feed.xml" title="Apache Cassandra Website" />
+</head>
+
+  <body>
+    <!-- breadcrumbs -->
+<div class="topnav">
+  <div class="container breadcrumb-container">
+    <ul class="breadcrumb">
+      <li>
+        <div class="dropdown">
+          <img class="asf-logo" src="./../../../../img/asf_feather.png" />
+          <a data-toggle="dropdown" href="#">Apache Software Foundation <span 
class="caret"></span></a>
+          <ul class="dropdown-menu" role="menu" aria-labelledby="dLabel">
+            <li><a href="http://www.apache.org">Apache Homepage</a></li>
+            <li><a href="http://www.apache.org/licenses/">License</a></li>
+            <li><a 
href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
+            <li><a 
href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+            <li><a href="http://www.apache.org/security/">Security</a></li>
+          </ul>
+        </div>
+      </li>
+
+      
+      <li><a href="./../../../../">Apache Cassandra</a></li>
+      
+
+      
+        
+        <li>Audit Logging in Apache Cassandra 4.0</li>
+        
+      
+
+      
+
+      
+    </ul>
+  </div>
+
+  <!-- navbar -->
+  <nav class="navbar navbar-default navbar-static-top" role="navigation">
+    <div class="container">
+      <div class="navbar-header">
+        <button type="button" class="navbar-toggle collapsed" 
data-toggle="collapse" data-target="#cassandra-menu" aria-expanded="false">
+          <span class="sr-only">Toggle navigation</span>
+          <span class="icon-bar"></span>
+          <span class="icon-bar"></span>
+          <span class="icon-bar"></span>
+        </button>
+        <a class="navbar-brand" href="./../../../../"><img 
src="./../../../../img/cassandra_logo.png" alt="Apache Cassandra logo" /></a>
+      </div><!-- /.navbar-header -->
+
+      <div id="cassandra-menu" class="collapse navbar-collapse">
+        <ul class="nav navbar-nav navbar-right">
+          <li><a href="./../../../../">Home</a></li>
+          <li><a href="./../../../../download/">Download</a></li>
+          <li><a href="./../../../../doc/">Documentation</a></li>
+          <li><a href="./../../../../community/">Community</a></li>
+          <li>
+            <a href="./../../../../blog">Blog</a>                    
+        </li>
+        </ul>
+      </div><!-- /#cassandra-menu -->
+
+      
+    </div>
+  </nav><!-- /.navbar -->
+</div><!-- /.topnav -->
+
+    <div class="content">
+  <div class="container">
+  <h2>Audit Logging in Apache Cassandra 4.0</h2>
+    <p>Posted on October 29, 2018 by the Apache Cassandra Community</p>
+    <h5><a href="/blog">&laquo; Back to the Apache Cassandra Blog</a></h5>
+    <hr />
+  <p>Database audit logging is an industry standard tool for enterprises to
+capture critical data change events including what data changed and who
+triggered the event. These captured records can then be reviewed later
+to ensure compliance with regulatory, security and operational policies.</p>
+
+<p>Prior to Apache Cassandra 4.0, the open source community did not have a
+good way of tracking such critical database activity. To close this gap,
+Netflix implemented
+<a 
href="https://issues.apache.org/jira/browse/CASSANDRA-12151">CASSANDRA-12151</a>
+so that users of Cassandra would have a simple yet powerful audit
+logging tool built into their database out of the box.</p>
+
+<h2 id="why-are-audit-logs-important">Why are Audit Logs Important?</h2>
+
+<p>Auditing database activity is one of the key components for making
+a database truly ready for the enterprise. Audit logging is broadly
+useful, and enterprises most frequently use it for:</p>
+
+<ol>
+  <li>Regulatory compliance with laws and standards such as <a 
href="https://en.wikipedia.org/wiki/Sarbanes%E2%80%93Oxley_Act">SOX</a>, <a 
href="https://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard">PCI</a>,
 and <a 
href="https://en.wikipedia.org/wiki/General_Data_Protection_Regulation">GDPR</a>.
 These types of compliance are crucial for companies that are traded on 
public stock exchanges, hold payment information such as credit cards, or 
retain private user information.</li>
+  <li>Security compliance. Companies often have strict rules for what data can 
be accessed by which employees, both to protect the privacy of users and 
to limit the probability of a data breach.</li>
+  <li>Debugging complex data corruption bugs such as those found in massively 
distributed microservice architectures like Netflix’s.</li>
+</ol>
+
+<h2 id="why-is-audit-logging-difficult">Why is Audit Logging Difficult?</h2>
+
+<p>Implementing a simple logger in the request (inbound/outbound) path
+sounds easy, but the devil is in the details. In particular, the “fast
+path” of a database, where audit logging must operate, strives to do as
+little as humanly possible so that users get the fastest and most
+scalable database system possible. While implementing Cassandra audit
+logging, we had to ensure that the audit log infrastructure did not
+take excessive CPU or IO resources away from database execution
+itself. However, one cannot optimize only for performance, because
+that may compromise the guarantees of the audit logging.</p>
+
+<p>For example, if producing an audit record would block a thread, the
+record should be dropped to maintain maximum performance. However, most
+compliance requirements prohibit dropping records. Therefore, the key to
+implementing audit logging correctly lies in allowing users to achieve
+both performance <em>and</em> reliability or, where both cannot be achieved,
+to make an explicit trade-off through configuration.</p>
+
+<hr />
+
+<h2 id="audit-logging-design-goals">Audit Logging Design Goals</h2>
+
+<p>The design goals of the audit log fall into three broad
+areas:</p>
+
+<p><strong>Performance</strong>: Because the audit log injection points
+live in the request path, performance is an important goal in every
+design decision.</p>
+
+<p><strong>Accuracy</strong>: Accuracy is required by compliance and is thus a
+critical goal. Audit Logging must be able to answer crucial auditor
+questions like “Is every write request to the database being audited?”.
+As such, accuracy cannot be compromised.</p>
+
+<p><strong>Usability &amp; Extensibility</strong>: The diverse Cassandra 
ecosystem
+demands that any frequently used feature must be easily usable and
+pluggable (e.g., Compaction, Compression, SeedProvider, etc.), so the
+Audit Log interface was designed with this context in mind from the
+start.</p>
+
+<h2 id="implementation">Implementation</h2>
+
+<p>With these three design goals in mind, the
+<a href="https://github.com/OpenHFT">OpenHFT</a> libraries were an
+obvious choice due to their reliability and high performance. Earlier, in
+<a 
href="https://issues.apache.org/jira/browse/CASSANDRA-13983">CASSANDRA-13983</a>,
+the <a href="https://github.com/OpenHFT/Chronicle-Queue">Chronicle Queue
+library</a> of
+OpenHFT was introduced to the Apache Cassandra code base as a BinLog
+utility for Full Query Logging (FQL). The performance of FQL was excellent, but it only 
instrumented mutation and read query paths. It was missing a lot of critical 
data, such as when queries failed, where they came from, and which user issued 
the query. FQL was also single purpose: it preferred to drop messages rather 
than delay the process (which makes sense for FQL but not for audit logging). 
Lastly, FQL didn’t allow for pluggability, which would have made it harder to 
adopt in the codebase for this feature.</p>
+
+<p>As shown in the architecture figure below, we were able to unify the FQL 
feature with the AuditLog functionality through the AuditLogManager and 
IAuditLogger abstractions. Using this architecture, we can support any output 
format: logs, files, databases, etc. By default, the BinAuditLogger 
implementation comes out of the box to maintain performance. Users can plug in 
a custom audit logger implementation by dropping its jar file on the Cassandra 
classpath and configuring it through the options in the
+<a 
href="https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L1216-L1234">cassandra.yaml</a>
+file.</p>
+
+<hr />
+
+<h2 id="architecture">Architecture</h2>
+

[... 175 lines stripped ...]
Modified: cassandra/site/publish/blog/index.html
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/blog/index.html?rev=1845181&r1=1845180&r2=1845181&view=diff
==============================================================================
--- cassandra/site/publish/blog/index.html (original)
+++ cassandra/site/publish/blog/index.html Mon Oct 29 22:53:37 2018
@@ -102,6 +102,18 @@
     <ul class="blog-post-listing">
       
         <li class="blog-post">
+          <h4><a href="/blog/2018/10/29/audit_logging_cassandra.html">Audit 
Logging in Apache Cassandra 4.0</a></h4>
+          <p>Posted on October 29, 2018 by the Apache Cassandra Community</p>
+          <p>Database audit logging is an industry standard tool for 
enterprises to
+capture critical data change events including what data changed and who
+triggered the event. These captured records can then be reviewed later
+to ensure compliance with regulatory, security and operational policies.</p>
+
+
+          <h5><a href="/blog/2018/10/29/audit_logging_cassandra.html">Read 
more &raquo;</a></h5>
+        </li>
+      
+        <li class="blog-post">
           <h4><a 
href="/blog/2018/10/17/finding_bugs_with_property_based_testing.html">Finding 
Bugs in Cassandra's Internals with Property-based Testing</a></h4>
           <p>Posted on October 17, 2018 by the Apache Cassandra Community</p>
           <p>As of September 1st, the Apache Cassandra community has shifted 
the focus of Cassandra 4.0 development from new feature work to testing, 
validation, and hardening, with the goal of releasing a stable 4.0 that every 
Cassandra user, from small deployments to large corporations, can deploy with 
confidence. There are several projects and methodologies that the community is 
undertaking to this end. One of these is the adoption of property-based 
testing, which was <a 
href="http://cassandra.apache.org/blog/2018/08/21/testing_apache_cassandra.html">previously
 introduced here</a>. This post will take a look at a specific use of this 
approach and how it found a bug in a new feature meant to ensure data integrity 
between the client and Cassandra.</p>

Modified: cassandra/site/publish/feed.xml
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/feed.xml?rev=1845181&r1=1845180&r2=1845181&view=diff
==============================================================================
--- cassandra/site/publish/feed.xml (original)
+++ cassandra/site/publish/feed.xml Mon Oct 29 22:53:37 2018
@@ -1,5 +1,208 @@
-<?xml version="1.0" encoding="utf-8"?><feed 
xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" 
version="3.4.3">Jekyll</generator><link 
href="http://cassandra.apache.org/feed.xml" rel="self" 
type="application/atom+xml" /><link href="http://cassandra.apache.org/" 
rel="alternate" type="text/html" 
/><updated>2018-10-18T14:18:30+13:00</updated><id>http://cassandra.apache.org/</id><title
 type="html">Apache Cassandra Website</title><subtitle>The Apache Cassandra 
database is the right choice when you need scalability and high availability 
without compromising performance. Linear scalability and proven fault-tolerance 
on commodity hardware or cloud infrastructure make it the perfect platform for 
mission-critical data. Cassandra's support for replicating across multiple 
datacenters is best-in-class, providing lower latency for your users and the 
peace of mind of knowing that you can survive regional outages.
-</subtitle><entry><title type="html">Finding Bugs in Cassandra’s Internals 
with Property-based Testing</title><link 
href="http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html"
 rel="alternate" type="text/html" title="Finding Bugs in Cassandra's Internals 
with Property-based Testing" 
/><published>2018-10-17T20:00:00+13:00</published><updated>2018-10-17T20:00:00+13:00</updated><id>http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing</id><content
 type="html" 
xml:base="http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html">&lt;p&gt;As
 of September 1st, the Apache Cassandra community has shifted the focus of 
Cassandra 4.0 development from new feature work to testing, validation, and 
hardening, with the goal of releasing a stable 4.0 that every Cassandra user, 
from small deployments to large corporations, can deploy with confidence. There 
are several projects and methodologies that
  the community is undertaking to this end. One of these is the adoption of 
property-based testing, which was &lt;a 
href=&quot;http://cassandra.apache.org/blog/2018/08/21/testing_apache_cassandra.html&quot;&gt;previously
 introduced here&lt;/a&gt;. This post will take a look at a specific use of 
this approach and how it found a bug in a new feature meant to ensure data 
integrity between the client and Cassandra.&lt;/p&gt;
+<?xml version="1.0" encoding="utf-8"?><feed 
xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" 
version="3.4.3">Jekyll</generator><link 
href="http://cassandra.apache.org/feed.xml" rel="self" 
type="application/atom+xml" /><link href="http://cassandra.apache.org/" 
rel="alternate" type="text/html" 
/><updated>2018-10-30T11:50:41+13:00</updated><id>http://cassandra.apache.org/</id><title
 type="html">Apache Cassandra Website</title><subtitle>The Apache Cassandra 
database is the right choice when you need scalability and high availability 
without compromising performance. Linear scalability and proven fault-tolerance 
on commodity hardware or cloud infrastructure make it the perfect platform for 
mission-critical data. Cassandra's support for replicating across multiple 
datacenters is best-in-class, providing lower latency for your users and the 
peace of mind of knowing that you can survive regional outages.
+</subtitle><entry><title type="html">Audit Logging in Apache Cassandra 
4.0</title><link 
href="http://cassandra.apache.org/blog/2018/10/29/audit_logging_cassandra.html" 
rel="alternate" type="text/html" title="Audit Logging in Apache Cassandra 4.0" 
/><published>2018-10-29T20:00:00+13:00</published><updated>2018-10-29T20:00:00+13:00</updated><id>http://cassandra.apache.org/blog/2018/10/29/audit_logging_cassandra</id><content
 type="html" 
xml:base="http://cassandra.apache.org/blog/2018/10/29/audit_logging_cassandra.html">&lt;p&gt;Database
 audit logging is an industry standard tool for enterprises to
+capture critical data change events including what data changed and who
+triggered the event. These captured records can then be reviewed later
+to ensure compliance with regulatory, security and operational 
policies.&lt;/p&gt;
+
+&lt;p&gt;Prior to Apache Cassandra 4.0, the open source community did not have 
a
+good way of tracking such critical database activity. To close this gap,
+Netflix implemented
+&lt;a 
href=&quot;https://issues.apache.org/jira/browse/CASSANDRA-12151&quot;&gt;CASSANDRA-12151&lt;/a&gt;
+so that users of Cassandra would have a simple yet powerful audit
+logging tool built into their database out of the box.&lt;/p&gt;
+
+&lt;h2 id=&quot;why-are-audit-logs-important&quot;&gt;Why are Audit Logs 
Important?&lt;/h2&gt;
+
+&lt;p&gt;Auditing database activity is one of the key components for making
+a database truly ready for the enterprise. Audit logging is broadly
+useful, and enterprises most frequently use it for:&lt;/p&gt;
+
+&lt;ol&gt;
+  &lt;li&gt;Regulatory compliance with laws and standards such as &lt;a 
href=&quot;https://en.wikipedia.org/wiki/Sarbanes%E2%80%93Oxley_Act&quot;&gt;SOX&lt;/a&gt;,
 &lt;a 
href=&quot;https://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard&quot;&gt;PCI&lt;/a&gt;,
 and &lt;a 
href=&quot;https://en.wikipedia.org/wiki/General_Data_Protection_Regulation&quot;&gt;GDPR&lt;/a&gt;.
 These types of compliance are crucial for companies that are traded on 
public stock exchanges, hold payment information such as credit cards, or 
retain private user information.&lt;/li&gt;
+  &lt;li&gt;Security compliance. Companies often have strict rules for what 
data can be accessed by which employees, both to protect the privacy of users 
and to limit the probability of a data breach.&lt;/li&gt;
+  &lt;li&gt;Debugging complex data corruption bugs such as those found in 
massively distributed microservice architectures like Netflix’s.&lt;/li&gt;
+&lt;/ol&gt;
+
+&lt;h2 id=&quot;why-is-audit-logging-difficult&quot;&gt;Why is Audit Logging 
Difficult?&lt;/h2&gt;
+
+&lt;p&gt;Implementing a simple logger in the request (inbound/outbound) path
+sounds easy, but the devil is in the details. In particular, the “fast
+path” of a database, where audit logging must operate, strives to do as
+little as humanly possible so that users get the fastest and most
+scalable database system possible. While implementing Cassandra audit
+logging, we had to ensure that the audit log infrastructure did not
+take excessive CPU or IO resources away from database execution
+itself. However, one cannot optimize only for performance, because
+that may compromise the guarantees of the audit logging.&lt;/p&gt;
+
+&lt;p&gt;For example, if producing an audit record would block a thread, the
+record should be dropped to maintain maximum performance. However, most
+compliance requirements prohibit dropping records. Therefore, the key to
+implementing audit logging correctly lies in allowing users to achieve
+both performance &lt;em&gt;and&lt;/em&gt; reliability or, where both cannot be 
achieved,
+to make an explicit trade-off through configuration.&lt;/p&gt;
+
+&lt;hr /&gt;
+
+&lt;h2 id=&quot;audit-logging-design-goals&quot;&gt;Audit Logging Design 
Goals&lt;/h2&gt;
+
+&lt;p&gt;The design goals of the audit log fall into three broad
+areas:&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt;: Because the audit log 
injection points
+live in the request path, performance is an important goal in every
+design decision.&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;Accuracy&lt;/strong&gt;: Accuracy is required by 
compliance and is thus a
+critical goal. Audit Logging must be able to answer crucial auditor
+questions like “Is every write request to the database being audited?”.
+As such, accuracy cannot be compromised.&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;Usability &amp;amp; Extensibility&lt;/strong&gt;: The 
diverse Cassandra ecosystem
+demands that any frequently used feature must be easily usable and
+pluggable (e.g., Compaction, Compression, SeedProvider, etc.), so the
+Audit Log interface was designed with this context in mind from the
+start.&lt;/p&gt;
+
+&lt;h2 id=&quot;implementation&quot;&gt;Implementation&lt;/h2&gt;
+
+&lt;p&gt;With these three design goals in mind, the
+&lt;a href=&quot;https://github.com/OpenHFT&quot;&gt;OpenHFT&lt;/a&gt; 
libraries were an
+obvious choice due to their reliability and high performance. Earlier, in
+&lt;a 
href=&quot;https://issues.apache.org/jira/browse/CASSANDRA-13983&quot;&gt;CASSANDRA-13983&lt;/a&gt;,
+the &lt;a 
href=&quot;https://github.com/OpenHFT/Chronicle-Queue&quot;&gt;Chronicle Queue
+library&lt;/a&gt; of
+OpenHFT was introduced to the Apache Cassandra code base as a BinLog
+utility for Full Query Logging (FQL). The performance of FQL was excellent, but it only 
instrumented mutation and read query paths. It was missing a lot of critical 
data, such as when queries failed, where they came from, and which user issued 
the query. FQL was also single purpose: it preferred to drop messages rather 
than delay the process (which makes sense for FQL but not for audit logging). 
Lastly, FQL didn’t allow for pluggability, which would have made it harder to 
adopt in the codebase for this feature.&lt;/p&gt;
+
+&lt;p&gt;As shown in the architecture figure below, we were able to unify the 
FQL feature with the AuditLog functionality through the AuditLogManager and 
IAuditLogger abstractions. Using this architecture, we can support any output 
format: logs, files, databases, etc. By default, the BinAuditLogger 
implementation comes out of the box to maintain performance. Users can plug in 
a custom audit logger implementation by dropping its jar file on the Cassandra 
classpath and configuring it through the options in the
+&lt;a 
href=&quot;https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L1216-L1234&quot;&gt;cassandra.yaml&lt;/a&gt;
+file.&lt;/p&gt;
+
+&lt;hr /&gt;
+
+&lt;h2 id=&quot;architecture&quot;&gt;Architecture&lt;/h2&gt;
+

[... 124 lines stripped ...]
Added: cassandra/site/src/_posts/2018-10-29-audit_logging_cassandra.markdown
URL: 
http://svn.apache.org/viewvc/cassandra/site/src/_posts/2018-10-29-audit_logging_cassandra.markdown?rev=1845181&view=auto
==============================================================================
--- cassandra/site/src/_posts/2018-10-29-audit_logging_cassandra.markdown 
(added)
+++ cassandra/site/src/_posts/2018-10-29-audit_logging_cassandra.markdown Mon 
Oct 29 22:53:37 2018
@@ -0,0 +1,211 @@
+---
+layout: post
+title: "Audit Logging in Apache Cassandra 4.0"
+date:   2018-10-29 00:00:00 -0700
+author: the Apache Cassandra Community
+categories: blog
+---
+
+Database audit logging is an industry standard tool for enterprises to
+capture critical data change events including what data changed and who
+triggered the event. These captured records can then be reviewed later
+to ensure compliance with regulatory, security and operational policies.
+
+Prior to Apache Cassandra 4.0, the open source community did not have a
+good way of tracking such critical database activity. To close this gap,
+Netflix implemented
+[CASSANDRA-12151](https://issues.apache.org/jira/browse/CASSANDRA-12151)
+so that users of Cassandra would have a simple yet powerful audit
+logging tool built into their database out of the box.
+
+## Why are Audit Logs Important?
+
+Auditing database activity is one of the key components for making
+a database truly ready for the enterprise. Audit logging is broadly
+useful, and enterprises most frequently use it for:
+
+1.  Regulatory compliance with laws and standards such as 
[SOX](https://en.wikipedia.org/wiki/Sarbanes%E2%80%93Oxley_Act), 
[PCI](https://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard),
 and [GDPR](https://en.wikipedia.org/wiki/General_Data_Protection_Regulation). 
These types of compliance are crucial for companies that are traded on 
public stock exchanges, hold payment information such as credit cards, or 
retain private user information.
+2.  Security compliance. Companies often have strict rules for what data can 
be accessed by which employees, both to protect the privacy of users and 
to limit the probability of a data breach.
+3.  Debugging complex data corruption bugs such as those found in massively 
distributed microservice architectures like Netflix's.
+
+## Why is Audit Logging Difficult?
+
+Implementing a simple logger in the request (inbound/outbound) path
+sounds easy, but the devil is in the details. In particular, the "fast
+path" of a database, where audit logging must operate, strives to do as
+little as humanly possible so that users get the fastest and most
+scalable database system possible. While implementing Cassandra audit
+logging, we had to ensure that the audit log infrastructure did not
+take excessive CPU or IO resources away from database execution
+itself. However, one cannot optimize only for performance, because
+that may compromise the guarantees of the audit logging.
+
+For example, if producing an audit record would block a thread, the
+record should be dropped to maintain maximum performance. However, most
+compliance requirements prohibit dropping records. Therefore, the key to
+implementing audit logging correctly lies in allowing users to achieve
+both performance *and* reliability or, where both cannot be achieved,
+to make an explicit trade-off through configuration.
+
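The block-versus-drop trade-off described above can be sketched with a bounded in-memory queue sitting between the request path and a background writer. This is only an illustration under stated assumptions: `AuditQueueSketch`, its `block` flag, and its methods are hypothetical names invented for this example and are not classes or options taken from the Cassandra code base.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not a class from the Cassandra code base.
final class AuditQueueSketch {
    private final BlockingQueue<String> queue;
    private final boolean block;                 // true: never drop; false: never stall
    private final AtomicLong dropped = new AtomicLong();

    AuditQueueSketch(int capacity, boolean block) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.block = block;
    }

    /** Called on the request path. Returns false if the record was not queued. */
    boolean submit(String record) {
        if (block) {
            try {
                queue.put(record);               // waits for space: reliable, may add latency
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        if (queue.offer(record)) {               // never waits
            return true;
        }
        dropped.incrementAndGet();               // queue full: record is lost, but counted
        return false;
    }

    /** Called by a background writer thread; returns null when nothing is queued. */
    String poll() {
        return queue.poll();
    }

    long dropped() {
        return dropped.get();
    }
}
```

With `block` set to `false`, a full queue costs the caller nothing but loses the record; with `block` set to `true`, the caller waits until the writer frees space, which is the behavior a compliance-driven deployment would more likely choose.
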
+---
+
+## Audit Logging Design Goals
+
+The design goals of the audit log fall into three broad
+areas:
+
+**Performance**: Because the audit log injection points
+live in the request path, performance is an important goal in every
+design decision.
+
+**Accuracy**: Accuracy is required by compliance and is thus a
+critical goal. Audit Logging must be able to answer crucial auditor
+questions like "Is every write request to the database being audited?".
+As such, accuracy cannot be compromised.
+
+**Usability & Extensibility**: The diverse Cassandra ecosystem
+demands that any frequently used feature must be easily usable and
+pluggable (e.g., Compaction, Compression, SeedProvider, etc.), so the
+Audit Log interface was designed with this context in mind from the
+start.
+
+## Implementation
+
+With these three design goals in mind, the
+[OpenHFT](https://github.com/OpenHFT) libraries were an
+obvious choice due to their reliability and high performance. Earlier, in
+[CASSANDRA-13983](https://issues.apache.org/jira/browse/CASSANDRA-13983),
+the [Chronicle Queue
+library](https://github.com/OpenHFT/Chronicle-Queue) of
+OpenHFT was introduced to the Apache Cassandra code base as a BinLog
+utility for Full Query Logging (FQL). The performance of FQL was excellent, but it only 
instrumented mutation and read query paths. It was missing a lot of critical 
data, such as when queries failed, where they came from, and which user issued 
the query. FQL was also single purpose: it preferred to drop messages rather 
than delay the process (which makes sense for FQL but not for audit logging). 
Lastly, FQL didn’t allow for pluggability, which would have made it harder to 
adopt in the codebase for this feature.
+
+As shown in the architecture figure below, we were able to unify the FQL 
feature with the AuditLog functionality through the AuditLogManager and 
IAuditLogger abstractions. Using this architecture, we can support any output 
format: logs, files, databases, etc. By default, the BinAuditLogger 
implementation comes out of the box to maintain performance. Users can plug in 
a custom audit logger implementation by dropping its jar file on the Cassandra 
classpath and configuring it through the options in the
+[cassandra.yaml](https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L1216-L1234)
+file.
+
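The pluggability idea behind the manager and logger abstractions can be sketched as follows. This is a simplified illustration, not the actual Cassandra interfaces: `AuditSinkSketch`, `InMemorySinkSketch`, and `AuditManagerSketch` are hypothetical names made up for this example, and the real `IAuditLogger` and `AuditLogManager` signatures may differ. The sketch assumes the manager receives a class name from configuration and instantiates it reflectively, which is one common way to let users supply an implementation from a jar on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a pluggable logger abstraction; the real interface may differ.
interface AuditSinkSketch {
    void log(String entry);
}

// One possible custom implementation: keeps entries in memory.
final class InMemorySinkSketch implements AuditSinkSketch {
    final List<String> entries = new ArrayList<>();

    @Override
    public void log(String entry) {
        entries.add(entry);
    }
}

// Hypothetical stand-in for a manager: resolves the configured class name
// reflectively, so any implementation found on the classpath can be plugged in.
final class AuditManagerSketch {
    private final AuditSinkSketch sink;

    AuditManagerSketch(String sinkClassName) {
        AuditSinkSketch loaded;
        try {
            Class<?> cls = Class.forName(sinkClassName);
            loaded = (AuditSinkSketch) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load audit sink: " + sinkClassName, e);
        }
        this.sink = loaded;
    }

    // Request-path code only talks to the manager, never to a concrete sink.
    void audit(String user, String operation) {
        sink.log(user + "|" + operation);
    }

    AuditSinkSketch sink() {
        return sink;
    }
}
```

Because callers depend only on the manager, swapping the output target (a binary log, a file, a database) would mean changing the configured class name rather than the request-path code.
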
+---
+
+## Architecture
+

[... 122 lines stripped ...]

