See https://issues.apache.org/jira/browse/CASSANDRA-9753
On Tue, Sep 8, 2015 at 10:22 AM, Tom van den Berge <tom.vandenbe...@gmail.com> wrote:

> I've been bugging you a few times, but now I've got trace data for a query
> with LOCAL_QUORUM that is being sent to a remote data center.
>
> The setup is as follows:
> NetworkTopologyStrategy: {"DC1":"1","DC2":"2"}
> Both DC1 and DC2 have 2 nodes.
> In DC2, one node is currently being rebuilt, and therefore does not
> contain all data (yet).
>
> The client app connects to a node in DC1, and sends a SELECT query with CL
> LOCAL_QUORUM, which in this case means (1/2)+1 = 1.
> If all is ok, the query always produces a result, because the requested
> rows are guaranteed to be available in DC1.
>
> However, the query sometimes produces no result. I've been able to record
> the traces of these queries, and it turns out that the coordinator node in
> DC1 sometimes sends the query to DC2, to the node that is being rebuilt
> and does not have the requested rows. I've included an example trace below.
>
> The coordinator node is 10.55.156.67, which is in DC1. The 10.88.4.194 node
> is in DC2.
> I've verified that the CL is LOCAL_QUORUM by printing it when the query is
> sent (I'm using the DataStax Java driver).
>
> activity                                                                   | source       | source_elapsed | thread
> ---------------------------------------------------------------------------+--------------+----------------+-----------------------------------------
> Message received from /10.55.156.67                                        | 10.88.4.194  |             48 | MessagingService-Incoming-/10.55.156.67
> Executing single-partition query on aggregate                              | 10.88.4.194  |            286 | SharedPool-Worker-2
> Acquiring sstable references                                               | 10.88.4.194  |            306 | SharedPool-Worker-2
> Merging memtable tombstones                                                | 10.88.4.194  |            321 | SharedPool-Worker-2
> Partition index lookup allows skipping sstable 107                         | 10.88.4.194  |            458 | SharedPool-Worker-2
> Bloom filter allows skipping sstable 1                                     | 10.88.4.194  |            489 | SharedPool-Worker-2
> Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones  | 10.88.4.194  |            496 | SharedPool-Worker-2
> Merging data from memtables and 0 sstables                                 | 10.88.4.194  |            500 | SharedPool-Worker-2
> Read 0 live and 0 tombstone cells                                          | 10.88.4.194  |            513 | SharedPool-Worker-2
> Enqueuing response to /10.55.156.67                                        | 10.88.4.194  |            613 | SharedPool-Worker-2
> Sending message to /10.55.156.67                                           | 10.88.4.194  |            672 | MessagingService-Outgoing-/10.55.156.67
> Parsing SELECT * FROM Aggregate WHERE type=? AND typeId=?;                 | 10.55.156.67 |             10 | SharedPool-Worker-4
> Sending message to /10.88.4.194                                            | 10.55.156.67 |           4335 | MessagingService-Outgoing-/10.88.4.194
> Message received from /10.88.4.194                                         | 10.55.156.67 |           6328 | MessagingService-Incoming-/10.88.4.194
> Seeking to partition beginning in data file                                | 10.55.156.67 |          10417 | SharedPool-Worker-3
> Key cache hit for sstable 389                                              | 10.55.156.67 |          10586 | SharedPool-Worker-3
>
> My question is: how is it possible that the query is sent to a node in
> DC2?
> Since DC1 has 2 nodes and RF 1, the query should always be sent to the
> other node in DC1 if the coordinator does not have a replica, right?
>
> Thanks,
> Tom

--
Tyler Hobbs
DataStax <http://datastax.com/>
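[Editor's note] For readers reproducing the setup described in the quoted mail, below is a minimal sketch, assuming the DataStax Java driver 2.x API of that era. The keyspace name (myks), bind values, and contact point are illustrative assumptions, not taken from the thread; only the replication settings, the query shape, and the LOCAL_QUORUM consistency level come from the post.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class LocalQuorumReadSketch {
    public static void main(String[] args) {
        // Contact point is a DC1 node, as in the thread (address reused for illustration).
        Cluster cluster = Cluster.builder()
                .addContactPoint("10.55.156.67")
                .build();
        Session session = cluster.connect();

        // Keyspace name "myks" is an assumption; replication matches the post:
        // NetworkTopologyStrategy with RF 1 in DC1 and RF 2 in DC2.
        session.execute("CREATE KEYSPACE IF NOT EXISTS myks "
                + "WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 1, 'DC2': 2}");

        // The single-partition read seen in the trace, issued at LOCAL_QUORUM.
        // Bind values are placeholders.
        Statement stmt = new SimpleStatement(
                "SELECT * FROM myks.aggregate WHERE type = ? AND typeId = ?",
                "someType", "someTypeId");
        stmt.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);

        // Print the CL before executing, as the poster describes doing to verify it.
        System.out.println("Consistency level: " + stmt.getConsistencyLevel());
        ResultSet rs = session.execute(stmt);
        System.out.println("Rows fetched: " + rs.getAvailableWithoutFetching());

        cluster.close();
    }
}

With RF 1 in DC1, LOCAL_QUORUM requires only one local replica to respond, which is why the poster expects the read to stay in DC1; the linked ticket discusses why the coordinator can nevertheless route to a remote replica.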