Hi,
Below is a sample trace for a LOCAL_QUORUM query. I've changed the queried
table/column names, and the actual node IP addresses to IP.1 and IP.coord (for
the co-ordinator node). RF=3 and we have 2 DCs. Don't we expect to see an
"IP.2" in the trace, since LOCAL_QUORUM requires the co-ordinator to receive at
least 2 responses? What am I missing here?
 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2016-09-15 04:17:55.401000 | IP.coord | 0
 Parsing SELECT A,B,C from T WHERE key1='K1' and key2='K2' and key3='K3' and key4='K4'; [SharedPool-Worker-2] | 2016-09-15 04:17:55.402000 | IP.coord | 57
 Preparing statement [SharedPool-Worker-2] | 2016-09-15 04:17:55.403000 | IP.coord | 140
 reading data from /IP.1 [SharedPool-Worker-2] | 2016-09-15 04:17:55.403000 | IP.coord | 1343
 Sending READ message to /IP.1 [MessagingService-Outgoing-/IP.1] | 2016-09-15 04:17:55.404000 | IP.coord | 1388
 REQUEST_RESPONSE message received from /IP.1 [MessagingService-Incoming-/IP.1] | 2016-09-15 04:17:55.404000 | IP.coord | 2953
 Processing response from /IP.1 [SharedPool-Worker-3] | 2016-09-15 04:17:55.404000 | IP.coord | 3001
 READ message received from /IP.coord [MessagingService-Incoming-/IP.coord] | 2016-09-15 04:17:55.405000 | IP.1 | 117
 Executing single-partition query on user_carts [SharedPool-Worker-1] | 2016-09-15 04:17:55.405000 | IP.1 | 253
 Acquiring sstable references [SharedPool-Worker-1] | 2016-09-15 04:17:55.406000 | IP.1 | 262
 Merging memtable tombstones [SharedPool-Worker-1] | 2016-09-15 04:17:55.406000 | IP.1 | 295
 Bloom filter allows skipping sstable 729 [SharedPool-Worker-1] | 2016-09-15 04:17:55.406000 | IP.1 | 341
 Partition index with 0 entries found for sstable 713 [SharedPool-Worker-1] | 2016-09-15 04:17:55.407000 | IP.1 | 411
 Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-09-15 04:17:55.407000 | IP.1 | 414
 Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-1] | 2016-09-15 04:17:55.407000 | IP.1 | 854
 Merging data from memtables and 1 sstables [SharedPool-Worker-1] | 2016-09-15 04:17:55.408000 | IP.1 | 860
 Read 1 live and 1 tombstone cells [SharedPool-Worker-1] | 2016-09-15 04:17:55.408000 | IP.1 | 910
 Enqueuing response to /IP.coord [SharedPool-Worker-1] | 2016-09-15 04:17:55.408000 | IP.1 | 1051
 Sending REQUEST_RESPONSE message to /IP.coord [MessagingService-Outgoing-/IP.coord] | 2016-09-15 04:17:55.409000 | IP.1 | 1110
 Request complete | 2016-09-15 04:17:55.404067 | IP.coord | 3067
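
For reference, an equivalent trace can also be pulled programmatically; here is
a minimal sketch with the DataStax Python driver, where the contact point,
keyspace and the table/column names are placeholders like the ones above:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    # Placeholder contact point and keyspace -- adjust to your cluster.
    cluster = Cluster(["IP.coord"])
    session = cluster.connect("my_keyspace")

    stmt = SimpleStatement(
        "SELECT A, B, C FROM T WHERE key1='K1' AND key2='K2' "
        "AND key3='K3' AND key4='K4'",
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
    )

    # trace=True asks the coordinator to record a query trace for this request.
    rows = session.execute(stmt, trace=True)
    trace = rows.get_query_trace()
    for event in trace.events:
        print(event.source_elapsed, event.source, event.description)
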
Thanks,
Joseph
On Tue, Sep 20, 2016 at 3:07 AM, Nicolas Douillet <
[email protected]> wrote:
> Hi Pranay,
>
> I'll try to answer as precisely as I can.
>
> Note that what I'm going to explain is valid only for reads, write
> requests work differently.
> I'm assuming you have only one DC.
>
> 1. The coordinator gets a list of the live replicas, sorted by proximity.
>    (I'm not sure enough how the sorting works to explain it here; it's done
>    by the snitch, I guess.)
>
> 2. By default *the coordinator keeps only the exact number of nodes
>    necessary* to ensure the desired consistency (2 nodes for RF=3), but,
>    depending on the read repair chance configured on each column family
>    (10% of the requests by default), *it might keep all the replicas* (if
>    there is one DC).
>
> 3. The coordinator checks that enough nodes are alive before trying any
>    request. If not, there is no need to go further, and you'll get a
>    slightly different error message:
>
>    *Live nodes <list> do not satisfy ConsistencyLevel (2 required)*
> 4. In substance, the coordinator waits for exactly the number of responses
>    needed to achieve the consistency level. To be more specific, the
>    coordinator does not request the same thing from each involved replica:
>    it asks one or two of them (the closest) for a full data read, and the
>    others only for a digest, then waits for exactly the number of responses
>    needed to achieve the consistency level, with at least one full data
>    response among them. (There is of course more to explain, for example
>    what happens if the digests do not match...) A rough sketch of this
>    follows below the list.
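>
> To make step 4 concrete, here is a rough, simplified sketch (plain Python,
> not actual Cassandra code) of how the coordinator could pick replicas for a
> quorum read; the function name and the returned structure are made up for
> illustration, only the quorum arithmetic and the data/digest split mirror
> what I described above:
>
>     import random
>
>     def plan_quorum_read(live_replicas_by_proximity, rf, read_repair_chance=0.1):
>         """Toy model of replica selection for a (LOCAL_)QUORUM read."""
>         # A quorum is a majority of the replicas: floor(RF/2) + 1, i.e. 2 when RF=3.
>         block_for = rf // 2 + 1
>
>         if len(live_replicas_by_proximity) < block_for:
>             raise RuntimeError("Live nodes %s do not satisfy ConsistencyLevel "
>                                "(%d required)" % (live_replicas_by_proximity, block_for))
>
>         # Usually only the nodes needed for the consistency level are kept, but
>         # roughly read_repair_chance of the requests keep every replica.
>         if random.random() < read_repair_chance:
>             targets = list(live_replicas_by_proximity)
>         else:
>             targets = live_replicas_by_proximity[:block_for]
>
>         # The closest target gets a full data read; the others only digest requests.
>         return {
>             "data_read": targets[0],
>             "digest_reads": targets[1:],
>             "wait_for": block_for,  # responses needed, at least one being full data
>         }
>
>     # Example: RF=3, three live replicas sorted by proximity.
>     print(plan_quorum_read(["IP.1", "IP.2", "IP.3"], rf=3))
>     # Most of the time: a data read to IP.1, a digest read to IP.2,
>     # and the coordinator waits for 2 responses.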
>
> So you're right when you talk about the fastest responses, but only
> under certain conditions and if additional replicas are requested.
>
>
> I'm certainly missing some points.
> Is that clear enough?
>
> --
> Nicolas
>
>
>
> On Mon, Sep 19, 2016 at 22:16, Pranay akula <[email protected]>
> wrote:
>
>>
>>
>> I always have this doubt: when a Cassandra node gets a read request at
>> LOCAL_QUORUM consistency, does the coordinator node ask all nodes with
>> replicas in that DC for a response, or only the fastest-responding nodes
>> whose count satisfies the local quorum?
>>
>> In this case RF is 3 and I got "Cassandra timeout during read query at
>> consistency LOCAL_QUORUM (2 responses were required but only 1 replica
>> responded)". Does this mean the coordinator asked only the two
>> fastest-responding replicas for data and 1 out of 2 timed out, or did the
>> coordinator ask all nodes with replicas, i.e. all three (3), and 2 out of 3
>> timed out, since I only got a single response back?
>>
>>
>>
>> Thanks
>>
>> Pranay
>>
>