[jira] [Commented] (CASSANDRA-13632) Digest mismatch if row is empty
[ https://issues.apache.org/jira/browse/CASSANDRA-13632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16170271#comment-16170271 ] Andrew Whang commented on CASSANDRA-13632: -- [~jasobrown] We're using old-school thrift via pycassa. I can repro in our shadow environment, but unable to in my dev using ccm + pycassa. Will let you know once I can get this error reproducible. > Digest mismatch if row is empty > --- > > Key: CASSANDRA-13632 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13632 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Andrew Whang >Assignee: Andrew Whang > Fix For: 3.0.x > > > This issue is similar to CASSANDRA-12090. Quorum read queries that include a > column selector (non-wildcard) result in digest mismatch when the row is > empty (key does not exist). It seems the data serialization path checks if > rowIterator.isEmpty() and if so ignores column names (by setting IS_EMPTY > flag). However, the digest serialization path does not perform this check and > includes column names. The digest comparison results in a mismatch. The > mismatch does not end up issuing a read repair mutation since the underlying > data is the same. > The mismatch on the read path ends up doubling our p99 read latency. We > discovered this issue while testing a 2.2.5 to 3.0.13 upgrade. > One thing to note is that we're using thrift, which ends up handling the > ColumnFilter differently than the CQL path. > As with CASSANDRA-12090, fixing the digest seems sensible. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
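The mismatch hypothesised in the description can be illustrated with a toy model. This is only a sketch and not Cassandra's serialization code: `serialize_data`, `serialize_for_digest`, the `IS_EMPTY` value, and the byte layout are all hypothetical, and MD5 is used purely for illustration. It shows how two encodings of the same empty result would hash differently if only one path skipped the column names.

```python
import hashlib

IS_EMPTY = 0x01  # hypothetical flag value, for illustration only


def _with_columns(rows, columns):
    """Toy encoding that always writes the selected column names."""
    header = bytes([0x00]) + ",".join(columns).encode()
    body = b"|".join(repr(row).encode() for row in rows)
    return header + b"#" + body


def serialize_data(rows, columns):
    """Toy data path: an empty result sets IS_EMPTY and omits column names."""
    if not rows:
        return bytes([IS_EMPTY])
    return _with_columns(rows, columns)


def serialize_for_digest(rows, columns):
    """Toy digest path: no emptiness check, column names always included."""
    return _with_columns(rows, columns)


def md5_hex(payload):
    return hashlib.md5(payload).hexdigest()


columns = ["col"]
data_hash = md5_hex(serialize_data([], columns))
digest_hash = md5_hex(serialize_for_digest([], columns))
mismatch = data_hash != digest_hash  # True only for the empty result in this model
```

In this model the two paths agree for any non-empty result and diverge only when the row is empty, which is the situation the description reports.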
[jira] [Commented] (CASSANDRA-13632) Digest mismatch if row is empty
[ https://issues.apache.org/jira/browse/CASSANDRA-13632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166318#comment-16166318 ]

Jason Brown commented on CASSANDRA-13632:

bq. the data serialization path checks if rowIterator.isEmpty() and if so ignores column names (by setting IS_EMPTY flag). However, the digest serialization path does not perform this check and includes column names.

This is correct.

bq. The digest comparison results in a mismatch.

This is not correct. In [{{DigestResolver}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/DigestResolver.java#L87], the coordinator gets the digest from each response, even for the node that sent back the full {{DataResponse}}. So, while there is a difference in what the data nodes send back to the coordinator, it resolves to the same digest value.

[~whangsf] what indications do you have that a DigestMismatch is happening?
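The argument above can be sketched with a small model. This is an assumption-laden illustration, not the actual {{DigestResolver}} code: the classes `DataResponse` and `DigestResponse`, the helper `digest_of`, and the use of MD5 are hypothetical stand-ins. The point it shows is that if the coordinator derives a digest from every response through one common function, the wire encoding of the full data response does not affect the comparison.

```python
import hashlib


def digest_of(rows, columns):
    """Toy digest: the single function used for every replica's result."""
    h = hashlib.md5()
    h.update(",".join(columns).encode())
    for row in rows:
        h.update(repr(row).encode())
    return h.hexdigest()


class DataResponse:
    """Full result from the data replica; its wire format is irrelevant here."""

    def __init__(self, rows):
        self.rows = rows

    def digest(self, columns):
        # The coordinator computes the digest from the rows it received.
        return digest_of(self.rows, columns)


class DigestResponse:
    """Digest-only reply, computed on the replica before sending."""

    def __init__(self, rows, columns):
        self._digest = digest_of(rows, columns)

    def digest(self, columns):
        return self._digest


def responses_match(responses, columns):
    """True when every response resolves to the same digest value."""
    return len({r.digest(columns) for r in responses}) <= 1


columns = ["col"]
empty_row_matches = responses_match(
    [DataResponse([]), DigestResponse([], columns)], columns
)
```

Under these assumptions an empty row on both replicas resolves to the same digest, so a mismatch would have to come from somewhere other than the wire format of the data response.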
[jira] [Commented] (CASSANDRA-13632) Digest mismatch if row is empty
[ https://issues.apache.org/jira/browse/CASSANDRA-13632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163650#comment-16163650 ]

Jason Brown commented on CASSANDRA-13632:

[~whangsf] Can you provide steps to reproduce? Also, what kind of improvement did you see when you applied your patch? Specific numbers would be great!

wrt thrift, are you using CQL over thrift, or old-school thrift a la astyanax/hector/pycassa? I tried to dig up a thrift client, but it was pretty painful and I stopped.
[jira] [Commented] (CASSANDRA-13632) Digest mismatch if row is empty
[ https://issues.apache.org/jira/browse/CASSANDRA-13632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061717#comment-16061717 ]

Jay Zhuang commented on CASSANDRA-13632:

Hi [~whangsf], could you please help me reproduce the problem locally? Here is what I did:

{noformat}
CREATE KEYSPACE foo WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '3' };
CREATE TABLE foo.foo ( key int, foo int, col int, PRIMARY KEY (key, foo) ) with dclocal_read_repair_chance=0;
CONSISTENCY QUORUM;
INSERT INTO foo.foo (key, foo) VALUES ( 1,1);
TRACING ON;
SELECT * FROM foo.foo WHERE key = 1 and foo =2;
{noformat}

But I don't see a Repair event:

{noformat}
cqlsh> SELECT * FROM foo.foo WHERE key = 1 and foo =2;

 key | foo | col
-----+-----+-----

(0 rows)

Tracing session: e9751120-5879-11e7-be43-dfd5ff1ad595

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2017-06-24 01:10:28.914000 | 127.0.0.1 | 0
 Parsing SELECT * FROM foo.foo WHERE key = 1 and foo =2; [SharedPool-Worker-1] | 2017-06-24 01:10:28.914000 | 127.0.0.1 | 220
 Preparing statement [SharedPool-Worker-1] | 2017-06-24 01:10:28.914000 | 127.0.0.1 | 423
 READ message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-06-24 01:10:28.915000 | 127.0.0.2 | 40
 reading digest from /127.0.0.2 [SharedPool-Worker-1] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1115
 Executing single-partition query on foo [SharedPool-Worker-2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1123
 Acquiring sstable references [SharedPool-Worker-2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1189
 Merging memtable contents [SharedPool-Worker-2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1228
 Sending READ message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1326
 Read 0 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1424
 Executing single-partition query on foo [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 221
 REQUEST_RESPONSE message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-06-24 01:10:28.916000 | 127.0.0.1 | 2817
 Acquiring sstable references [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 342
 Merging memtable contents [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 402
 Read 0 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 640
 Read 0 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 714
 Enqueuing response to /127.0.0.1 [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 770
 Sending REQUEST_RESPONSE message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-06-24 01:10:28.916001 | 127.0.0.2 | 917
 Processing response from /127.0.0.2 [SharedPool-Worker-2] | 2017-06-24 01:10:28.917000 | 127.0.0.1 | 2916
 Request complete | 2017-06-24 01:10:28.917133 | 127.0.0.1 | 3133
{noformat}
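Checking a long trace by eye is error-prone, so a small helper can scan the pasted cqlsh output instead. This is a rough sketch: the function name `find_digest_mismatches` is hypothetical, and the marker text "Digest mismatch" is an assumption about how such an event would be worded in the activity column. Adjust the marker to whatever your version actually emits.

```python
def find_digest_mismatches(trace_text, marker="Digest mismatch"):
    """Return the activity strings of trace rows that contain the marker.

    Each row is expected in the cqlsh layout:
    activity | timestamp | source | source_elapsed
    """
    hits = []
    for line in trace_text.splitlines():
        # Only the first column (activity) is inspected.
        activity = line.split("|", 1)[0].strip()
        if marker.lower() in activity.lower():
            hits.append(activity)
    return hits


# A few rows copied from the trace above.
sample = """\
 reading digest from /127.0.0.2 [SharedPool-Worker-1] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1115
 Processing response from /127.0.0.2 [SharedPool-Worker-2] | 2017-06-24 01:10:28.917000 | 127.0.0.1 | 2916
 Request complete | 2017-06-24 01:10:28.917133 | 127.0.0.1 | 3133
"""

mismatches = find_digest_mismatches(sample)
```

For the rows copied from this trace the helper finds nothing, which agrees with the observation that no repair event shows up.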