[jira] [Updated] (CASSANDRA-13983) Support a means of logging all queries as they were invoked

Ariel Weisberg (JIRA) Wed, 01 Nov 2017 10:50:21 -0700

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ariel Weisberg updated CASSANDRA-13983:
---------------------------------------
    Status: Patch Available  (was: Open)

|[code|https://github.com/apache/cassandra/pull/169]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13983-trunk]|[dtests|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/405/]|

> Support a means of logging all queries as they were invoked
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-13983
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13983
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: CQL, Observability, Testing, Tools
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>            Priority: Major
>             Fix For: 4.0
>
>
> For correctness testing it's useful to be able to capture production traffic 
> so that it can be replayed against both the old and new versions of Cassandra 
> while comparing the results.
> Implementing this functionality once, inside the database, performs well 
> and presents less operational complexity.
> [This patch|https://github.com/apache/cassandra/pull/169] implements a 
> full query log that uses chronicle-queue (Apache licensed; the Maven 
> artifacts are labeled incorrectly in some cases, and the dependencies are 
> also Apache licensed) to provide a rotating log of queries.
> * Single thread asynchronously writes log entries to disk to reduce impact on 
> query latency
> * Heap memory usage is bounded by a weighted queue with a configurable 
> maximum weight sitting in front of the logging thread
> * If the weighted queue is full producers can be blocked or samples can be 
> dropped
> * Disk utilization is bounded by deleting old log segments once a 
> configurable size is reached
> * The on-disk serialization uses a flexible schema binary format 
> (chronicle-wire), making it easy to skip unrecognized fields, add new ones, 
> and omit old ones.
> * Can be enabled, configured, disabled, and reset (deleting on-disk data) 
> via JMX; the logging path is configurable via both JMX and YAML
> * Introduces a new {{fqltool}} in /bin that currently implements {{Dump}}, 
> which can dump full query logs in a human-readable format as well as follow 
> active full query logs
> Follow up work:
> * Introduce new {{fqltool}} command Replay which can replay N full query logs 
> to two different clusters and compare the result and check for 
> inconsistencies. <- Actively working on getting this done
> * Log not just queries but their results to facilitate a comparison between 
> the original query result and the replayed result. <- No specific use case 
> for this at the moment
> * "Consistent" query logging allowing replay to fully replicate the original 
> order of execution and completion even in the face of races (including CAS). 
> <- This is more speculative
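
The weighted queue behaviour described in the bullets above (bounded total weight, producers either block or drop a sample when full, a single consumer draining to disk) can be sketched roughly as follows. This is a minimal illustration only and not the code from the patch: the class name WeightedQueueSketch, its methods, and the use of string length as the weight are all assumptions made for the example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.function.ToIntFunction;

// Hypothetical sketch of a weight-bounded queue; names are illustrative,
// not taken from the Cassandra patch.
final class WeightedQueueSketch<T> {
    private final int maxWeight;
    private final Semaphore freeWeight;            // permits = weight still available
    private final LinkedBlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final ToIntFunction<T> weigher;        // must return the same weight for the same item

    WeightedQueueSketch(int maxWeight, ToIntFunction<T> weigher) {
        this.maxWeight = maxWeight;
        this.freeWeight = new Semaphore(maxWeight);
        this.weigher = weigher;
    }

    // Clamp to [1, maxWeight] so a single oversized entry can still be
    // admitted once the queue is empty instead of blocking forever.
    private int weightOf(T item) {
        return Math.max(1, Math.min(maxWeight, weigher.applyAsInt(item)));
    }

    // Non-blocking producer path: returns false (the sample is dropped)
    // when there is not enough free weight.
    boolean offer(T item) {
        if (!freeWeight.tryAcquire(weightOf(item))) {
            return false;
        }
        queue.add(item);
        return true;
    }

    // Blocking producer path: waits until enough weight has been freed.
    void put(T item) throws InterruptedException {
        freeWeight.acquire(weightOf(item));
        queue.add(item);
    }

    // Consumer path for a dedicated logging thread: blocks for the next
    // entry and returns its weight to the pool.
    T take() throws InterruptedException {
        T item = queue.take();
        freeWeight.release(weightOf(item));
        return item;
    }

    // Non-blocking consumer path: returns null when the queue is empty.
    T poll() {
        T item = queue.poll();
        if (item != null) {
            freeWeight.release(weightOf(item));
        }
        return item;
    }

    int availableWeight() {
        return freeWeight.availablePermits();
    }

    public static void main(String[] args) {
        WeightedQueueSketch<String> q = new WeightedQueueSketch<>(10, String::length);
        System.out.println(q.offer("SELECT")); // weight 6 fits within the limit of 10
        System.out.println(q.offer("UPDATE")); // weight 6 exceeds the 4 remaining, so it is dropped
        System.out.println(q.poll());          // consumer drains one entry, freeing its weight
        System.out.println(q.offer("UPDATE")); // fits now that the weight has been released
    }
}
```

In this sketch the choice between offer and put stands in for the "drop samples" versus "block producers" option, and a single thread looping on take would play the role of the asynchronous log writer.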



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
