[jira] [Commented] (CASSANDRA-13632) Digest mismatch if row is empty

2017-09-18 Thread Andrew Whang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16170271#comment-16170271
 ] 

Andrew Whang commented on CASSANDRA-13632:
--

[~jasobrown] We're using old-school thrift via pycassa. I can repro in our 
shadow environment, but I'm unable to in my dev environment using ccm + 
pycassa. Will let you know once I can reproduce this error reliably. 

> Digest mismatch if row is empty
> ---
>
> Key: CASSANDRA-13632
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13632
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Andrew Whang
>Assignee: Andrew Whang
> Fix For: 3.0.x
>
>
> This issue is similar to CASSANDRA-12090. Quorum read queries that include a 
> column selector (non-wildcard) result in digest mismatch when the row is 
> empty (key does not exist). It seems the data serialization path checks if 
> rowIterator.isEmpty() and if so ignores column names (by setting IS_EMPTY 
> flag). However, the digest serialization path does not perform this check and 
> includes column names. The digest comparison results in a mismatch. The 
> mismatch does not end up issuing a read repair mutation since the underlying 
> data is the same.
> The mismatch on the read path ends up doubling our p99 read latency. We 
> discovered this issue while testing a 2.2.5 to 3.0.13 upgrade.
> One thing to note is that we're using thrift, which ends up handling the 
> ColumnFilter differently than the CQL path. 
> As with CASSANDRA-12090, fixing the digest seems sensible.
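
For readers following along, the hypothesis in the description can be sketched in a few lines of Python. This is only an illustration, not Cassandra code: the function names {{digest_with_columns}} and {{digest_skipping_columns}} are hypothetical, MD5 is just a stand-in hash, and the assumption is that one path hashes the selected column names even for an empty result while the other skips them.

```python
import hashlib


def digest_with_columns(selected_columns, rows):
    """Hypothetical digest path: always hashes the selected column names."""
    h = hashlib.md5()
    for name in selected_columns:
        h.update(name.encode("utf-8"))
    for row in rows:
        for name in selected_columns:
            h.update(repr(row.get(name)).encode("utf-8"))
    return h.hexdigest()


def digest_skipping_columns(selected_columns, rows):
    """Hypothetical data path: an empty result ignores column names entirely."""
    if not rows:
        return hashlib.md5().hexdigest()
    return digest_with_columns(selected_columns, rows)


# Empty row with a column selector: the two paths disagree.
print(digest_with_columns(["col"], []) == digest_skipping_columns(["col"], []))
# Empty row with no named columns (wildcard-like): the two paths agree.
print(digest_with_columns([], []) == digest_skipping_columns([], []))
```

Under these assumptions only the combination "empty row + explicit column selector" yields different digests, which would match the symptom reported above.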



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13632) Digest mismatch if row is empty

2017-09-14 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166318#comment-16166318
 ] 

Jason Brown commented on CASSANDRA-13632:
-

bq. the data serialization path checks if rowIterator.isEmpty() and if so 
ignores column names (by setting IS_EMPTY flag). However, the digest 
serialization path does not perform this check and includes column names. 

This is correct.

bq. The digest comparison results in a mismatch.

This is not correct. In 
[{{DigestResolver}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/DigestResolver.java#L87],
 the coordinator gets the digest from each response, even from the node that 
sent back the full {{DataResponse}}. So, while there is a difference in what 
the data nodes send back to the coordinator, it resolves to the same digest 
value.

[~whangsf] what indications do you have that a DigestMismatch is happening?
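
A rough Python sketch of the argument above, for illustration only. It is not the {{DigestResolver}} implementation: the {{Response}} class and the {{content_digest}}, {{digest_of}} and {{responses_match}} helpers are hypothetical names, and the assumption is that the coordinator derives a digest from a full data response with the same function a replica uses to build a digest-only response.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


def content_digest(columns, rows):
    """Stand-in digest over the logical result (column names plus values)."""
    h = hashlib.md5()
    for name in columns:
        h.update(name.encode("utf-8"))
    for row in rows:
        for name in columns:
            h.update(repr(row.get(name)).encode("utf-8"))
    return h.digest()


@dataclass
class Response:
    columns: List[str]
    rows: Optional[List[dict]] = None  # set on a full data response
    digest: Optional[bytes] = None     # set on a digest-only response


def digest_of(response):
    """Coordinator side: take the digest as sent, or compute it from the data."""
    if response.digest is not None:
        return response.digest
    return content_digest(response.columns, response.rows)


def responses_match(responses):
    """True when every response resolves to the same digest value."""
    return len({digest_of(r) for r in responses}) == 1


# One replica returns full (empty) data, the other returns only a digest.
data = Response(columns=["col"], rows=[])
digest_only = Response(columns=["col"], digest=content_digest(["col"], []))
print(responses_match([data, digest_only]))
```

In this model the wire format of the data response does not matter, because the comparison happens on digests computed the same way on both sides. A mismatch would need the two sides to hash different inputs.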




[jira] [Commented] (CASSANDRA-13632) Digest mismatch if row is empty

2017-09-12 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163650#comment-16163650
 ] 

Jason Brown commented on CASSANDRA-13632:
-

[~whangsf] Can you provide steps to reproduce? Also, what kind of improvement 
did you see when you applied your patch? Specific numbers would be great! 

wrt thrift, are you using CQL over thrift, or old-school thrift a la 
astyanax/hector/pycassa? I tried to dig up a thrift client but it was pretty 
painful and I stopped.






[jira] [Commented] (CASSANDRA-13632) Digest mismatch if row is empty

2017-06-23 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061717#comment-16061717
 ] 

Jay Zhuang commented on CASSANDRA-13632:


Hi [~whangsf], could you please help me reproduce the problem locally? Here 
is what I did:

{noformat}
CREATE KEYSPACE foo WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '3'};
CREATE TABLE foo.foo (key int, foo int, col int, PRIMARY KEY (key, foo)) WITH dclocal_read_repair_chance = 0;
CONSISTENCY QUORUM;
INSERT INTO foo.foo (key, foo) VALUES ( 1,1);
TRACING ON;
SELECT * FROM foo.foo WHERE key = 1 and foo =2;
{noformat}

But I don't see a read repair event in the trace:
{noformat}
cqlsh> SELECT * FROM foo.foo WHERE key = 1 and foo =2;

 key | foo | col
-----+-----+-----

(0 rows)

Tracing session: e9751120-5879-11e7-be43-dfd5ff1ad595

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2017-06-24 01:10:28.914000 | 127.0.0.1 | 0
 Parsing SELECT * FROM foo.foo WHERE key = 1 and foo =2; [SharedPool-Worker-1] | 2017-06-24 01:10:28.914000 | 127.0.0.1 | 220
 Preparing statement [SharedPool-Worker-1] | 2017-06-24 01:10:28.914000 | 127.0.0.1 | 423
 READ message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-06-24 01:10:28.915000 | 127.0.0.2 | 40
 reading digest from /127.0.0.2 [SharedPool-Worker-1] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1115
 Executing single-partition query on foo [SharedPool-Worker-2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1123
 Acquiring sstable references [SharedPool-Worker-2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1189
 Merging memtable contents [SharedPool-Worker-2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1228
 Sending READ message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1326
 Read 0 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-06-24 01:10:28.915000 | 127.0.0.1 | 1424
 Executing single-partition query on foo [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 221
 REQUEST_RESPONSE message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-06-24 01:10:28.916000 | 127.0.0.1 | 2817
 Acquiring sstable references [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 342
 Merging memtable contents [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 402
 Read 0 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 640
 Read 0 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 714
 Enqueuing response to /127.0.0.1 [SharedPool-Worker-1] | 2017-06-24 01:10:28.916000 | 127.0.0.2 | 770
 Sending REQUEST_RESPONSE message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-06-24 01:10:28.916001 | 127.0.0.2 | 917
 Processing response from /127.0.0.2 [SharedPool-Worker-2] | 2017-06-24 01:10:28.917000 | 127.0.0.1 | 2916
 Request complete | 2017-06-24 01:10:28.917133 | 127.0.0.1 | 3133
{noformat}
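
When comparing many traces like the one above, a small helper can flag the ones that mention a mismatch. This is a minimal sketch under an assumption: that a digest mismatch would show up as a trace activity containing the text "digest mismatch". That marker string and the {{find_digest_mismatches}} name are illustrative and not confirmed anywhere in this ticket.

```python
def find_digest_mismatches(activities, marker="digest mismatch"):
    """Return trace activities whose text contains the marker, ignoring case."""
    needle = marker.lower()
    return [a for a in activities if needle in a.lower()]


# A few activity strings copied from the trace above.
trace = [
    "reading digest from /127.0.0.2",
    "Read 0 live and 0 tombstone cells",
    "Processing response from /127.0.0.2",
]

print(find_digest_mismatches(trace))
```

For the activities copied from this trace the helper returns an empty list, consistent with no mismatch being visible in the run.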
