[jira] [Comment Edited] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-18 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198173#comment-17198173
 ] 

ZhaoYang edited comment on CASSANDRA-16036 at 9/18/20, 7:16 AM:


!16036_128mb.png!

 

Above is write performance in a mixed read-write test, using a 128 MB cache, 
comparing 16036-disable-chunk-cache with its baseline. Disabling the chunk cache 
significantly improves latency. (Read performance is similar to write performance.)

 

 

!15229_128mb.png!

Above is write performance in a mixed read-write test, using a 128 MB cache, 
comparing 15229-disable-chunk-cache with 15229-improved-buffer-pool. Disabling the 
chunk cache shows some improvement in latency. (Read performance is similar to 
write performance.)


was (Author: jasonstack):
!16036_128mb.png!

 

Above is write performance in a mixed read-write test, comparing 
16036-disable-chunk-cache with its baseline. Disabling the chunk cache 
significantly improves latency. (Read performance is similar to write performance.)

 

 

!15229_128mb.png!

Above is write performance in a mixed read-write test, comparing 
15229-disable-chunk-cache with 15229-improved-buffer-pool. Disabling the chunk 
cache shows some improvement in latency. (Read performance is similar to write 
performance.)

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 15229_128mb.png, 16036_128mb.png, 
> async-profile.collapsed.svg, 
> clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_selects_baseline_attempt3.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline_attempt3.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> The chunk cache is enabled by default and has no flag to disable it without 
> impacting networking. While performance testing 4.0 against 3.0, I found that 
> reads were slower in 4.0; profiling showed that the ChunkCache was 
> partially to blame, and read performance improved after disabling the chunk 
> cache.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}
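To make the two histogram summaries easier to compare, the relative change can be worked out directly from the figures quoted above. This is only a small arithmetic sketch: the helper name is illustrative, and the unit of the values is whatever the .hdr files recorded, which the ticket does not state.

```python
def relative_change(before, after):
    """Return the fractional change from `before` to `after` (negative means a decrease)."""
    return (after - before) / before


# Figures copied from the two summaries in the issue description.
with_cc = {"mean": 11.50063, "stddev": 13.44014, "max": 482.41254}
without_cc = {"mean": 9.82115, "stddev": 10.14270, "max": 522.36493}

for key in ("mean", "stddev", "max"):
    change = relative_change(with_cc[key], without_cc[key])
    print(f"{key}: {change:+.1%}")
```

On these numbers the mean is roughly 14.6% lower and the standard deviation roughly 24.5% lower without the chunk cache, while the single worst-case maximum is about 8.3% higher.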



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-18 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198175#comment-17198175
 ] 

ZhaoYang commented on CASSANDRA-16036:
--

+1 to disabling the chunk cache until CASSANDRA-15229 and other improvements 
(e.g. fixed buffer size) land in the chunk cache.




[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-18 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198173#comment-17198173
 ] 

ZhaoYang commented on CASSANDRA-16036:
--

!16036_128mb.png!

 

Above is write performance in a mixed read-write test, comparing 
16036-disable-chunk-cache with its baseline. Disabling the chunk cache 
significantly improves latency. (Read performance is similar to write performance.)

 

 

!15229_128mb.png!

Above is write performance in a mixed read-write test, comparing 
15229-disable-chunk-cache with 15229-improved-buffer-pool. Disabling the chunk 
cache shows some improvement in latency. (Read performance is similar to write 
performance.)




[jira] [Updated] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-17 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16036:
-
Attachment: 16036_128mb.png
15229_128mb.png




[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-16 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197107#comment-17197107
 ] 

ZhaoYang commented on CASSANDRA-16036:
--

I wonder whether the chunk-cache regression is related to CASSANDRA-15229; let me 
run some tests from CASSANDRA-15229.

 

Also, I am not a committer, so you may need to find one more reviewer.




[jira] [Updated] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-15 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16036:
-
Reviewers: Jon Meredith, ZhaoYang  (was: Jon Meredith)




[jira] [Commented] (CASSANDRA-16123) Use materialized view, CPU usage over 100%

2020-09-12 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17194804#comment-17194804
 ] 

ZhaoYang commented on CASSANDRA-16123:
--

The jstack shows that HintsDispatcher is running, and it may generate writes 
that involve the MV.

 

{{Keyspace.applyInternal, line 545}} will retry until the write timeout (2s) is 
reached, so I don't think it will consume CPU indefinitely.
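The bounded-retry behaviour described here can be pictured with a short sketch. This is not Cassandra's code: the function name, the `try_acquire` callback and the sleep interval are hypothetical, and only the 2s limit comes from the comment above. It shows why a loop that gives up at a deadline cannot spin forever on a single write.

```python
import time


def apply_with_retry(try_acquire, timeout_s=2.0, sleep_s=0.01, clock=time.monotonic):
    """Call try_acquire() until it returns True or timeout_s has elapsed.

    Returns the number of attempts made, or raises TimeoutError at the deadline.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if try_acquire():
            return attempts
        if clock() >= deadline:
            # The write is abandoned here instead of retrying indefinitely.
            raise TimeoutError(f"gave up after {attempts} attempts")
        time.sleep(sleep_s)
```

Each individual write stops retrying at the deadline, but a steady stream of new writes (for example from hint replay) could still keep a thread busy while it lasts.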

> Use materialized view, CPU usage over 100%
> --
>
> Key: CASSANDRA-16123
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16123
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: chenbing
>Priority: Normal
> Attachments: image-2020-09-12-16-50-41-341.png, 
> image-2020-09-12-16-52-46-581.png, image-2020-09-12-16-55-49-813.png, 
> jstack_25640.txt
>
>
> env info:
> os: CentOS Linux release 7.4.1708
> cassandra: apache-cassandra-3.11.8
> jdk: 1.8.0_261
>  
> I am using a materialized view, and CPU usage stays over 100% even when no CQL 
> client is sending requests. 
> My analysis process was as follows:
> 1. top
>  The pid is 25640.
> 2. top -Hp 25640
> !image-2020-09-12-16-52-46-581.png!
> 3. printf "%x\n" 26065
>  Converting thread id 26065 to hex gives 65d1.
> 4. jstack -l 25640 > jstack_25640.txt
>  Find 65d1 in jstack_25640.txt.
> !image-2020-09-12-16-55-49-813.png!
>  
> 5. Find the frame in the source code: 
> org.apache.cassandra.db.Keyspace.applyInternal, line 545.
> I guess the CPU usage over 100% is caused by a loop calling 
> Keyspace.applyInternal.
>  
> Does anyone have any suggestions?
> The jstack_25640.txt file is attached.
>  
>  
>  
>  
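The thread-matching steps above (convert the busy thread id to hex, then look it up in the thread dump) can be sketched as a small helper. It relies on HotSpot thread dumps printing the OS thread id as an `nid=0x...` field; the function names and the sample dump lines are made up for illustration and are not taken from the attached jstack_25640.txt.

```python
def jstack_nid(thread_id):
    """Format an OS thread id the way HotSpot thread dumps print it (nid=0x...)."""
    return f"nid=0x{thread_id:x}"


def find_thread_headers(dump_text, thread_id):
    """Return the thread header lines in a jstack dump that match the given OS thread id."""
    needle = jstack_nid(thread_id)
    # Compare whole tokens so that 0x65d1 does not also match 0x65d10.
    return [line for line in dump_text.splitlines() if needle in line.split()]


# Illustrative dump fragment; the thread names and addresses are invented.
sample_dump = '''
"ExampleThread-1" #42 daemon prio=5 os_prio=0 tid=0x00007f0000000001 nid=0x65d1 runnable [0x00007f0000001000]
"ExampleThread-2" #43 daemon prio=5 os_prio=0 tid=0x00007f0000000002 nid=0x65d2 waiting on condition [0x00007f0000002000]
'''

print(jstack_nid(26065))
print(find_thread_headers(sample_dump, 26065))
```

For thread id 26065 the helper produces `nid=0x65d1`, matching the hex value from step 3.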






[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-09-10 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193818#comment-17193818
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

Thanks for the review and feedback

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta3
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{CassandraEntireSSTableStreamReader#read}}.
> {code}
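The core of the sequence above is that component sizes are captured before the files are sent, so an in-place rewrite in between invalidates the recorded sizes. The sketch below reproduces only that ordering on a temporary file; it is not Cassandra code, and the function names, file name and file contents are hypothetical stand-ins.

```python
import os
import tempfile


def build_manifest(paths):
    """Record each component's length at manifest-creation time."""
    return {path: os.path.getsize(path) for path in paths}


def stale_components(manifest):
    """Return components whose current length no longer matches the manifest."""
    return [p for p, size in manifest.items() if os.path.getsize(p) != size]


with tempfile.TemporaryDirectory() as tmp:
    stats = os.path.join(tmp, "example-Statistics.db")  # hypothetical file name
    with open(stats, "wb") as f:
        f.write(b"pendingRepair=session-id")  # stand-in for serialized metadata

    manifest = build_manifest([stats])        # step 2: sizes are captured

    with open(stats, "wb") as f:              # step 5: metadata is rewritten in place
        f.write(b"pendingRepair=null")

    mismatched = stale_components(manifest)   # step 6: the file no longer matches
    print(mismatched == [stats])
```

A length comparison like this only catches rewrites that change the file size; a same-size mutation would need a content checksum to detect.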
> Currently, entire-sstable-streaming requires sstable components to be 
> immutable, because the {{ComponentManifest}} 
> with component sizes is sent before the actual files. This isn't a 
> problem in legacy streaming, as the STATS file length didn't matter.
>  
> Ideally it will be great to make sstable STATS metadata immuta

[jira] [Updated] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-09-09 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15861:
-
Test and Documentation Plan: 
https://app.circleci.com/pipelines/github/jasonstack/cassandra/306/workflows/27e49813-d93b-49df-9722-737b932710b3
  (was: 
[https://app.circleci.com/pipelines/github/jasonstack/cassandra/301/workflows/4daf1646-77d4-4b83-8df6-5caeb73f2fe8])


[jira] [Updated] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2020-09-07 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16092:
-
Test and Documentation Plan: 
https://app.circleci.com/pipelines/github/jasonstack/cassandra/305/workflows/6c813342-2bdb-4740-8599-6a8c34ab97da
 Status: Patch Available  (was: In Progress)

> Add Index Group Interface for Storage Attached Index
> 
>
> Key: CASSANDRA-16092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16092
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/SASI
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
>
> [Index 
> group|https://github.com/datastax/cassandra/blob/storage_attached_index/src/java/org/apache/cassandra/index/Index.java#L634]
>  interface allows:
> * indexes on the same table to receive centralized lifecycle events through 
> what are called secondary index groups. Sharing data between multiple column 
> indexes on the same table allows SAI to realise significant disk space savings 
> over other index implementations.
> * an index group to analyze a user query and provide a query plan that 
> leverages all available indexes within the group.






[jira] [Updated] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2020-09-07 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16092:
-
Change Category: Semantic  (was: Code Clarity)




[jira] [Updated] (CASSANDRA-16108) Concurrent Index Memtable implementation using Trie

2020-09-07 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16108:
-
Fix Version/s: 4.x

> Concurrent Index Memtable implementation using Trie
> ---
>
> Key: CASSANDRA-16108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16108
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: ZhaoYang
>Assignee: ratcharod
>Priority: Normal
> Fix For: 4.x
>
>
> Replace the existing {{ConcurrentRadixTree}} with a Trie implementation for 
> both numeric and string indexes to reduce memory usage.






[jira] [Commented] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2020-09-07 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191625#comment-17191625
 ] 

ZhaoYang commented on CASSANDRA-16092:
--

I have ported [Index interface 
changes|https://github.com/apache/cassandra/pull/735] for Storage Attached 
Index:
 * {{Index#Group}} to manage lifecycle of multiple indexes that can communicate 
with each other.
 * {{Index#QueryPlan}} to provide a set of indexes that can work together for a 
given query.
 * {{Index#Searcher}} to perform actual index searching.
 * Enhanced {{SSTableFlushObserver}} to pass the partition deletion, static row, 
and unfiltered separately.
 * Moved {{UpdateTransaction}} into {{CFS}} so that we can make sure the memtable 
and index memtable are in sync.

  cc [~adelapena] [~maedhroz]

> Add Index Group Interface for Storage Attached Index
> 
>
> Key: CASSANDRA-16092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16092
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/SASI
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
>
> [Index 
> group|https://github.com/datastax/cassandra/blob/storage_attached_index/src/java/org/apache/cassandra/index/Index.java#L634]
>  interface allows:
> * indexes on the same table to receive centralized lifecycle events called 
> secondary index groups. Sharing of data between multiple column indexes on 
> the same table allows SAI disk usage to realise significant space savings 
> over other index implementations.
> * index-group to analyze user query and provide a query plan that leverages 
> all available indexes within the group.






[jira] [Updated] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2020-09-07 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16092:
-
Change Category: Code Clarity
 Complexity: Normal
  Fix Version/s: 4.x
 Status: Open  (was: Triage Needed)

> Add Index Group Interface for Storage Attached Index
> 
>
> Key: CASSANDRA-16092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16092
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/SASI
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
>
> [Index 
> group|https://github.com/datastax/cassandra/blob/storage_attached_index/src/java/org/apache/cassandra/index/Index.java#L634]
>  interface allows:
> * indexes on the same table to receive centralized lifecycle events called 
> secondary index groups. Sharing of data between multiple column indexes on 
> the same table allows SAI disk usage to realise significant space savings 
> over other index implementations.
> * index-group to analyze user query and provide a query plan that leverages 
> all available indexes within the group.






[jira] [Updated] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2020-09-07 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16092:
-
Source Control Link: https://github.com/apache/cassandra/pull/735







[jira] [Created] (CASSANDRA-16108) Concurrent Index Memtable implementation using Trie

2020-09-06 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-16108:


 Summary: Concurrent Index Memtable implementation using Trie
 Key: CASSANDRA-16108
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16108
 Project: Cassandra
  Issue Type: New Feature
Reporter: ZhaoYang


Replace existing {{ConcurrentRadixTree}} with Trie implementation for both 
numeric index and string index to reduce memory usage.
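
As a rough illustration of why a trie can save memory for indexed terms, the sketch below stores terms character by character so that shared prefixes occupy a single chain of nodes. It is a plain, single-threaded toy: the class name {{TermTrieSketch}} and its methods are hypothetical, and it says nothing about the concurrent structure or the numeric encoding that the ticket has in mind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a plain single-threaded character trie, not Cassandra's implementation.
final class TermTrieSketch {

    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        final List<Integer> rowIds = new ArrayList<>();
    }

    private final Node root = new Node();
    private int nodeCount = 1; // counts the root

    /** Index a term: walk or create one node per character, then record the row id. */
    void add(String term, int rowId) {
        Node node = root;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            Node next = node.children.get(c);
            if (next == null) {
                next = new Node();
                node.children.put(c, next);
                nodeCount++;
            }
            node = next;
        }
        node.rowIds.add(rowId);
    }

    /** Return the row ids of every term that starts with the given prefix. */
    List<Integer> prefixSearch(String prefix) {
        Node node = root;
        for (int i = 0; i < prefix.length(); i++) {
            node = node.children.get(prefix.charAt(i));
            if (node == null) return new ArrayList<>();
        }
        List<Integer> out = new ArrayList<>();
        collect(node, out);
        return out;
    }

    /** Depth-first walk in character order, gathering row ids below a node. */
    private static void collect(Node node, List<Integer> out) {
        out.addAll(node.rowIds);
        for (Node child : node.children.values()) collect(child, out);
    }

    /** Number of nodes allocated so far, including the root. */
    int nodeCount() { return nodeCount; }

    public static void main(String[] args) {
        TermTrieSketch trie = new TermTrieSketch();
        trie.add("index", 1);
        trie.add("indexes", 2);
        trie.add("insert", 3);
        System.out.println(trie.prefixSearch("ind") + " nodes=" + trie.nodeCount());
    }
}
```

The three terms in {{main}} contain 18 characters in total but need fewer nodes than that, because "index" and "indexes" share a whole path and "insert" shares its first two characters with both.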






[jira] [Assigned] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2020-09-02 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang reassigned CASSANDRA-16092:


Assignee: ZhaoYang







[jira] [Updated] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-09-02 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15861:
-
Test and Documentation Plan: 
[https://app.circleci.com/pipelines/github/jasonstack/cassandra/301/workflows/4daf1646-77d4-4b83-8df6-5caeb73f2fe8]
  (was: 
[https://app.circleci.com/pipelines/github/jasonstack/cassandra/300/workflows/f41f6585-cd97-4791-abdc-a2935694948e])

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> The above test executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports a checksum validation failure on 
> an sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{CassandraEntireSSTableStreamReader#read}}.
> {code}
> Currently, entire-sstable-streaming requires sstable components to be 
> immutable, because {{ComponentManifest}}
> with component sizes are 

[jira] [Updated] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-09-01 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15861:
-
Test and Documentation Plan: 
[https://app.circleci.com/pipelines/github/jasonstack/cassandra/300/workflows/f41f6585-cd97-4791-abdc-a2935694948e]
  (was: 
[https://app.circleci.com/pipelines/github/jasonstack/cassandra/298/workflows/4e56c8b2-a998-4785-9daf-0ceee52a9a83])


[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-09-01 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188976#comment-17188976
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

Pushed a unit test to verify "compressionMetadata" is used to calculate the 
transferred size for compressed sstable.

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> The above test executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports a checksum validation failure on 
> an sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{CassandraEntireSSTableStreamReader#read}}.
> {code}
> Currently, entire-sstable-streaming requires sstable components to be 
> immutable, because {{ComponentManifest}}
> with component sizes is sent before sending actual files. This isn't a 
> problem in legacy streaming as STATS file length d

[jira] [Updated] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-09-01 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15861:
-
Test and Documentation Plan: 
[https://app.circleci.com/pipelines/github/jasonstack/cassandra/298/workflows/4e56c8b2-a998-4785-9daf-0ceee52a9a83]
  (was: https://circleci.com/workflow-run/610e8169-e60c-420b-a556-4120967db6cb)


[jira] [Created] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2020-09-01 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-16092:


 Summary: Add Index Group Interface for Storage Attached Index
 Key: CASSANDRA-16092
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16092
 Project: Cassandra
  Issue Type: New Feature
  Components: Feature/SASI
Reporter: ZhaoYang


[Index 
group|https://github.com/datastax/cassandra/blob/storage_attached_index/src/java/org/apache/cassandra/index/Index.java#L634]
 interface allows:
* indexes on the same table to receive centralized lifecycle events called 
secondary index groups. Sharing of data between multiple column indexes on the 
same table allows SAI disk usage to realise significant space savings over 
other index implementations.
* index-group to analyze user query and provide a query plan that leverages all 
available indexes within the group.







[jira] [Comment Edited] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-31 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188178#comment-17188178
 ] 

ZhaoYang edited comment on CASSANDRA-15861 at 9/1/20, 6:46 AM:
---

bq. Could you elaborate on what needs to be changed specifically in mine code 
so it will be fully ok again?

Hmm, some code was moved in {{CassandraOutgoingFiles}}. What are you worried 
about?

bq. Are you already sure that your changes are computing sizes in both 
compressed and uncompressed paths right?

This patch is for zero-copy streaming, to avoid partially written files; it is 
not about how size is calculated.

The change in {{CassandraStreamHeader}} around compressed size restores the 
original behavior (reduced GC) from before the storage-engine refactoring.
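
For readers following the ticket, the underlying race is an ordering problem: file lengths are recorded in a manifest first, and a component file is rewritten before it is actually sent. The sketch below reproduces only that ordering with a temporary file. The names {{ManifestRaceSketch}}, {{recordSizes}} and {{stillMatches}} are hypothetical, the file contents are made up, and this is neither the streaming code nor the fix in the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not the Cassandra streaming code.
final class ManifestRaceSketch {

    /** Record the current length of each component file, as a manifest would. */
    static Map<Path, Long> recordSizes(Iterable<Path> components) throws IOException {
        Map<Path, Long> sizes = new LinkedHashMap<>();
        for (Path p : components) sizes.put(p, Files.size(p));
        return sizes;
    }

    /** True if every file still has the length that was recorded for it. */
    static boolean stillMatches(Map<Path, Long> manifest) throws IOException {
        for (Map.Entry<Path, Long> e : manifest.entrySet())
            if (Files.size(e.getKey()) != e.getValue()) return false;
        return true;
    }

    /**
     * Reproduces the ordering: record sizes, rewrite the file, check again.
     * Returns { matchesBeforeRewrite, matchesAfterRewrite }.
     */
    static boolean[] demo() {
        try {
            Path stats = Files.createTempFile("sketch-Statistics", ".db");
            try {
                // Made-up content standing in for metadata with a pending repair id.
                Files.write(stats, "pendingRepair=session-id".getBytes(StandardCharsets.UTF_8));
                Map<Path, Long> manifest = recordSizes(Collections.singletonList(stats));
                boolean before = stillMatches(manifest);

                // A concurrent metadata rewrite changes the file length after the manifest was built.
                Files.write(stats, "pendingRepair=null".getBytes(StandardCharsets.UTF_8));
                boolean after = stillMatches(manifest);
                return new boolean[] { before, after };
            } finally {
                Files.deleteIfExists(stats);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("before rewrite: " + r[0] + ", after rewrite: " + r[1]);
    }
}
```

A length check like {{stillMatches}} only detects the mismatch; avoiding it requires that the sender read a stable view of each component, which is the part the patch itself addresses.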


was (Author: jasonstack):
bq. Could you elaborate on what needs to be changed specifically in mine code 
so it will be fully ok again?

Hmm, some code was moved in {{CassandraOutgoingFiles}}. What are you worried 
about?

bq. Are you already sure that your changes are computing sizes in both 
compressed and uncompressed paths right?

This patch is for zero-copy streaming, to avoid partially written files; it is 
not about how size is calculated.

The change in {{CassandraStreamHeader}} around compressed size restores the 
original behavior from before the storage-engine refactoring.

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> The above test executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports a checksum validation failure on 
> an sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> 

[jira] [Comment Edited] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-31 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188178#comment-17188178
 ] 

ZhaoYang edited comment on CASSANDRA-15861 at 9/1/20, 6:45 AM:
---

bq. Could you elaborate on what needs to be changed specifically in mine code 
so it will be fully ok again?

Hmm, some code was moved in {{CassandraOutgoingFiles}}. What are you worried 
about?

bq. Are you already sure that your changes are computing sizes in both 
compressed and uncompressed paths right?

This patch is for zero-copy streaming, to avoid partially written files; it is 
not about how size is calculated.

The change in {{CassandraStreamHeader}} around compressed size restores the 
original behavior from before the storage-engine refactoring.


was (Author: jasonstack):
bq. Could you elaborate on what needs to be changed specifically in mine code 
so it will be fully ok again?

Hmm, some code was moved in {{CassandraOutgoingFiles}}. What are you worried 
about?

bq. Are you already sure that your changes are computing sizes in both 
compressed and uncompressed paths right?

This patch is for zero-copy streaming, to avoid partially written files; it is 
not about how size is calculated.

The change in {{ComponentMetadata}} around compressed size restores the 
original behavior from before the storage-engine refactoring.

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broad

[jira] [Comment Edited] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-31 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188178#comment-17188178
 ] 

ZhaoYang edited comment on CASSANDRA-15861 at 9/1/20, 6:44 AM:
---

bq. Could you elaborate on what needs to be changed specifically in my code 
so it will be fully ok again?

Hmm, some code was moved in {{CassandraOutgoingFiles}}. What are you worried 
about?

bq. Are you already sure that your changes are computing sizes in both 
compressed and uncompressed paths right?

This patch is for zero-copy-streaming to avoid partially written files; it is 
not about how size is calculated.

The change in {{ComponentMetadata}} around compressed size is to restore 
the original behavior from before the storage-engine refactoring.


was (Author: jasonstack):
bq. Could you elaborate on what needs to be changed specifically in my code 
so it will be fully ok again?

Hmm, some code was moved in {{CassandraOutgoingFiles}}. What are you worried 
about?

bq. Are you already sure that your changes are computing sizes in both 
compressed and uncompressed paths right?

This patch is for zero-copy-streaming to avoid partially written files; it is 
not about how size is calculated.

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-messag

[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-31 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188178#comment-17188178
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

bq. Could you elaborate on what needs to be changed specifically in my code 
so it will be fully ok again?

Hmm, some code was moved in {{CassandraOutgoingFiles}}. What are you worried 
about?

bq. Are you already sure that your changes are computing sizes in both 
compressed and uncompressed paths right?

This patch is for zero-copy-streaming to avoid partially written files; it is 
not about how size is calculated.

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{C

[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-31 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188166#comment-17188166
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

[~stefan.miklosovic] I took a brief look at the patch in 15406, I think there 
are just minor superficial conflicts, no compatibility issue.

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{CassandraEntireSSTableStreamReader#read}}.
> {code}
> Currently, entire-sstable-streaming requires sstable components to be 
> immutable, because the {{ComponentManifest}} 
> with component sizes is sent before the actual files. This isn't a 
> problem in legacy streaming

[jira] [Updated] (CASSANDRA-16076) Batch schema statements to create multiple SASI and MV at once to reduce disk IO

2020-08-26 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16076:
-
Summary: Batch schema statements to create multiple SASI and MV at once to 
reduce disk IO  (was: Batch schema statement to create multiple SASI and MV at 
once to reduce disk IO)

> Batch schema statements to create multiple SASI and MV at once to reduce disk 
> IO
> 
>
> Key: CASSANDRA-16076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16076
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Materialized Views, Feature/SASI
>Reporter: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
>
> Currently, an operator has to create SASI/MV one by one on the same table, and 
> every index/view build needs to read all data on disk.
> In order to speed up multiple SASI/MV creation, I propose adding a new batch 
> schema statement to create multiple SASI/MV at once, so that C* only needs to 
> read on-disk data once.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16076) Batch schema statement to create multiple SASI and MV at once to reduce disk IO

2020-08-26 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16076:
-
Summary: Batch schema statement to create multiple SASI and MV at once to 
reduce disk IO  (was: Batch schema statement to multiple SASI and MV at once to 
reduce disk IO)

> Batch schema statement to create multiple SASI and MV at once to reduce disk 
> IO
> ---
>
> Key: CASSANDRA-16076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16076
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Materialized Views, Feature/SASI
>Reporter: ZhaoYang
>Priority: Normal
>
> Currently, an operator has to create SASI/MV one by one on the same table, and 
> every index/view build needs to read all data on disk.
> In order to speed up multiple SASI/MV creation, I propose adding a new batch 
> schema statement to create multiple SASI/MV at once, so that C* only needs to 
> read on-disk data once.






[jira] [Created] (CASSANDRA-16076) Batch schema statement to multiple SASI and MV at once to reduce disk IO

2020-08-26 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-16076:


 Summary: Batch schema statement to multiple SASI and MV at once to 
reduce disk IO
 Key: CASSANDRA-16076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16076
 Project: Cassandra
  Issue Type: New Feature
  Components: Feature/Materialized Views, Feature/SASI
Reporter: ZhaoYang


Currently, an operator has to create SASI/MV one by one on the same table, and 
every index/view build needs to read all data on disk.

In order to speed up multiple SASI/MV creation, I propose adding a new batch 
schema statement to create multiple SASI/MV at once, so that C* only needs to 
read on-disk data once.
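
The saving proposed above can be illustrated with a small, self-contained sketch. It is not Cassandra code and does not propose any CQL syntax: the class {{SinglePassBuildSketch}}, the toy {{Table}}, and the helper {{scansNeeded}} are hypothetical names used only to contrast one scan per index/view with one shared scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class SinglePassBuildSketch {
    /** Hypothetical stand-in for on-disk data; counts how many full scans were made. */
    static final class Table {
        private final List<String> rows;
        int scans = 0;

        Table(List<String> rows) { this.rows = rows; }

        void scan(Consumer<String> visitor) {
            scans++;
            rows.forEach(visitor);
        }
    }

    /** One full scan per builder: creating indexes/views one at a time. */
    static void buildOneByOne(Table table, List<Consumer<String>> builders) {
        for (Consumer<String> builder : builders)
            table.scan(builder);
    }

    /** One full scan in total: every row is handed to every builder. */
    static void buildBatched(Table table, List<Consumer<String>> builders) {
        table.scan(row -> builders.forEach(builder -> builder.accept(row)));
    }

    /** Runs the chosen strategy on a toy table and reports the number of scans. */
    static int scansNeeded(int builderCount, boolean batched) {
        Table table = new Table(Arrays.asList("row1", "row2", "row3"));
        List<Consumer<String>> builders = new ArrayList<>();
        for (int i = 0; i < builderCount; i++) {
            List<String> indexed = new ArrayList<>(); // toy "index": just collects rows
            builders.add(indexed::add);
        }
        if (batched)
            buildBatched(table, builders);
        else
            buildOneByOne(table, builders);
        return table.scans;
    }

    public static void main(String[] args) {
        System.out.println("one by one: " + scansNeeded(3, false) + " scans");
        System.out.println("batched:    " + scansNeeded(3, true) + " scans");
    }
}
```

With three builders, the one-by-one path scans the table three times and the batched path scans it once, which is the disk-IO reduction the ticket is after.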






[jira] [Updated] (CASSANDRA-16076) Batch schema statement to create multiple SASI and MV at once to reduce disk IO

2020-08-26 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16076:
-
Fix Version/s: 4.x

> Batch schema statement to create multiple SASI and MV at once to reduce disk 
> IO
> ---
>
> Key: CASSANDRA-16076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16076
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Materialized Views, Feature/SASI
>Reporter: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
>
> Currently, an operator has to create SASI/MV one by one on the same table, and 
> every index/view build needs to read all data on disk.
> In order to speed up multiple SASI/MV creation, I propose adding a new batch 
> schema statement to create multiple SASI/MV at once, so that C* only needs to 
> read on-disk data once.






[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-26 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185031#comment-17185031
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

[~bdeggleston] I have restored previous commits, sorry for the trouble

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{CassandraEntireSSTableStreamReader#read}}.
> {code}
> Currently, entire-sstable-streaming requires sstable components to be 
> immutable, because the {{ComponentManifest}} 
> with component sizes is sent before the actual files. This isn't a 
> problem in legacy streaming, as the STATS file length didn't matter there.
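
The sequence described above can be reproduced in miniature. The sketch below is only an illustration under stated assumptions: it uses no Cassandra classes, the names {{ManifestRaceSketch}}, {{recordLength}} and {{matchesManifest}} are hypothetical, and the STATS component is modelled as a plain temp file whose length is recorded before the file is rewritten.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ManifestRaceSketch {
    /** Hypothetical, simplified manifest entry: only the component length, captured up front. */
    static long recordLength(Path component) throws IOException {
        return Files.size(component);
    }

    /** True when the file on disk still has the length that was recorded earlier. */
    static boolean matchesManifest(Path component, long recordedLength) throws IOException {
        return Files.size(component) == recordedLength;
    }

    /** Replays steps 2, 5 and 6 on a temp file and reports whether the manifest is still valid. */
    static boolean manifestStillValid() {
        try {
            Path stats = Files.createTempFile("na-2-big-Statistics", ".db");
            try {
                // State after anti-compaction: a pending repair id is set.
                Files.write(stats, "pendingRepair=session-id".getBytes(StandardCharsets.UTF_8));
                // Step 2: the sender records component lengths before streaming.
                long recorded = recordLength(stats);
                // Step 5: a background task rewrites the metadata in place.
                Files.write(stats, "pendingRepair=null".getBytes(StandardCharsets.UTF_8));
                // Step 6: the file is streamed only now; compare with the recorded length.
                return matchesManifest(stats, recorded);
            } finally {
                Files.deleteIfExists(stats);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("manifest still valid: " + manifestStillValid());
    }
}
```

Because the rewrite happens between recording the length and sending the file, the check fails. Any fix has to either keep the component immutable while it is being streamed or capture the length and the bytes atomically.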
>  
> Ideally it will be great to

[jira] [Commented] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-25 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184111#comment-17184111
 ] 

ZhaoYang commented on CASSANDRA-16071:
--

Thanks for the patch. LGTM.

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}






[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang edited comment on CASSANDRA-16071 at 8/24/20, 11:57 AM:
-

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb" (153GB)}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
 {{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}. So it should be :
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
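
To make the unit arithmetic in the proposal above concrete, here is a minimal, self-contained sketch. It is not the SASI code: the class {{FlushMemoryUnits}}, the helper {{maxMemBytes}}, and the 0.15 multiplier (inferred from the "1048576 * 0.15" remark) are assumptions for illustration only.

```java
public class FlushMemoryUnits {
    // Assumed stand-in for INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER (0.15 per the comment above).
    static final double DEFAULT_MULTIPLIER = 0.15;
    static final long BYTES_PER_MB = 1024L * 1024L;
    static final long BYTES_PER_GB = 1024L * BYTES_PER_MB;

    /**
     * Returns the flush threshold in bytes. The option, when present, is read
     * as megabytes, matching the "_in_mb" suffix of the setting name.
     */
    static long maxMemBytes(String optionInMb) {
        return optionInMb == null
             ? (long) (BYTES_PER_GB * DEFAULT_MULTIPLIER)  // 15% of 1 GiB, in bytes
             : BYTES_PER_MB * Long.parseLong(optionInMb);  // MB -> bytes
    }

    public static void main(String[] args) {
        System.out.println(maxMemBytes(null));   // default threshold in bytes
        System.out.println(maxMemBytes("512"));  // 512 MB expressed in bytes
    }
}
```

Keeping the value in bytes internally and converting once at the parsing boundary means a user-supplied "512" becomes 536870912 bytes, while the default stays near 154 MB rather than the roughly 153 GB that results from treating 1048576 * 0.15 as megabytes.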


was (Author: jasonstack):
{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb" (153GB)}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes" and rename 
 {{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}. So it should be :
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}
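
A rough illustration of the scale described in the quote above: if the value 512 is read as bytes, each flushed segment is capped at 512 bytes instead of 512 MB. The sketch below assumes, purely for illustration, 512 MiB of index data to flush and one temp file per segment; neither figure comes from the ticket, and the class and method names are hypothetical.

```java
// Hypothetical arithmetic sketch; the data volume is an assumed figure.
final class SegmentCountSketch
{
    // Number of segments needed to flush totalBytes with a per-segment cap (ceiling division).
    static long segments(long totalBytes, long maxSegmentBytes)
    {
        return (totalBytes + maxSegmentBytes - 1) / maxSegmentBytes;
    }

    public static void main(String[] args)
    {
        long totalBytes = 512L * 1024 * 1024; // assumed index data volume: 512 MiB
        long intendedCap = 512L * 1024 * 1024; // "512" read as megabytes
        long actualCap = 512L;                 // "512" read as bytes (the reported behaviour)

        System.out.println(segments(totalBytes, intendedCap)); // 1
        System.out.println(segments(totalBytes, actualCap));   // 1048576
    }
}
```

Under these assumptions the byte interpretation yields about a million segments where one was intended, which matches the order of magnitude in the report.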



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang edited comment on CASSANDRA-16071 at 8/24/20, 11:53 AM:
-

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb" (153GB)}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
{{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? It should then be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}


was (Author: jasonstack):
{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
{{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? It should then be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}






[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang edited comment on CASSANDRA-16071 at 8/24/20, 11:52 AM:
-

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
{{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? It should then be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}


was (Author: jasonstack):
{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
{{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? It should then be:

{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}






[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang edited comment on CASSANDRA-16071 at 8/24/20, 11:50 AM:
-

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
{{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? It should then be:

{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}


was (Author: jasonstack):
{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}}? It should then be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}






[jira] [Commented] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang commented on CASSANDRA-16071:
--

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}}? It should then be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}






[jira] [Updated] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16071:
-
Test and Documentation Plan: CI running
 Status: Patch Available  (was: In Progress)

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}






[jira] [Updated] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16071:
-
Reviewers: ZhaoYang

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}






[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-21 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181851#comment-17181851
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

Updated the patch based on Caleb's builder approach; 
dfile/ifile/bf/indexSummary are now all final.

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> The above test executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports a checksum validation failure on 
> the sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{CassandraEntireSSTableStreamReader#read}}.
> {code}
> Currently, entire-sstable-streaming requires sstable components to be 
> immutable, because {{ComponentManifest}}, which carries the component sizes, 
> is sent before the actual files. This isn't a problem in legacy streaming, 
> as the STATS file length didn't matter there.
>  
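
The size mismatch described in the steps above can be reproduced in miniature. The sketch below is a hypothetical illustration, not Cassandra's {{ComponentManifest}}: it records a file's length up front, rewrites the file with content of a different length (standing in for the metadata mutation), and then checks whether the recorded length still holds. The file contents and all class and method names are invented for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; this is not Cassandra's ComponentManifest.
final class ManifestRaceSketch
{
    // Record the current length of each component, as a sender would before streaming.
    static Map<Path, Long> snapshotSizes(Iterable<Path> components) throws IOException
    {
        Map<Path, Long> sizes = new LinkedHashMap<>();
        for (Path p : components)
            sizes.put(p, Files.size(p));
        return sizes;
    }

    // True if every component still has the length recorded in the manifest.
    static boolean stillMatches(Map<Path, Long> manifest) throws IOException
    {
        for (Map.Entry<Path, Long> e : manifest.entrySet())
            if (Files.size(e.getKey()) != e.getValue().longValue())
                return false;
        return true;
    }

    public static void main(String[] args) throws IOException
    {
        Path stats = Files.createTempFile("stats-sketch", ".db");
        try
        {
            // Invented stand-in for the serialized metadata while repair is pending.
            Files.write(stats, "pendingRepair=session-id".getBytes(StandardCharsets.UTF_8));
            Map<Path, Long> manifest = snapshotSizes(Collections.singletonList(stats));
            System.out.println(stillMatches(manifest)); // true

            // A mutation after the manifest was built rewrites the file with a new length.
            Files.write(stats, "pendingRepair=null".getBytes(StandardCharsets.UTF_8));
            System.out.println(stillMatches(manifest)); // false
        }
        finally
        {
            Files.deleteIfExists(stats);
        }
    }
}
```

A receiver that trusts the recorded length reads too many or too few bytes, which is consistent with the checksum failure reported on node3.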

[jira] [Updated] (CASSANDRA-16052) CEP-7 Storage Attached Index for Apache Cassandra

2020-08-17 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16052:
-
Summary: CEP-7 Storage Attached Index for Apache Cassandra  (was: Storage 
Attached Index for Apache Cassandra)

> CEP-7 Storage Attached Index for Apache Cassandra
> -
>
> Key: CASSANDRA-16052
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16052
> Project: Cassandra
>  Issue Type: Epic
>  Components: Feature/2i Index
>Reporter: ZhaoYang
>Priority: Normal
>
> [CEP|https://docs.google.com/document/d/1V830eAMmQAspjJdjviVZIaSolVGvZ1hVsqOLWyV0DS4/edit#heading=h.67ap6rr1mxr]
>  - A new index implementation, called Storage Attached Index (SAI), based on 
> the advancements made by SASI.
>  * disk usage by sharing of common data between multiple column indexes on 
> the same table and better compression of on-disk structures.
>  * numeric range query performance with modified KDTree and collection type 
> support.
>  * compaction performance and stability for larger data set.






[jira] [Created] (CASSANDRA-16052) Storage Attached Index for Apache Cassandra

2020-08-17 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-16052:


 Summary: Storage Attached Index for Apache Cassandra
 Key: CASSANDRA-16052
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16052
 Project: Cassandra
  Issue Type: Epic
  Components: Feature/2i Index
Reporter: ZhaoYang


[CEP|https://docs.google.com/document/d/1V830eAMmQAspjJdjviVZIaSolVGvZ1hVsqOLWyV0DS4/edit#heading=h.67ap6rr1mxr]
 - A new index implementation, called Storage Attached Index (SAI), based on 
the advancements made by SASI.
 * disk usage by sharing of common data between multiple column indexes on the 
same table and better compression of on-disk structures.

 * numeric range query performance with modified KDTree and collection type 
support.

 * compaction performance and stability for larger data set.






[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-08-12 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176756#comment-17176756
 ] 

ZhaoYang commented on CASSANDRA-16036:
--

[~dcapwell] sorry, won't be able to look deeper into chunk cache this week. But 
based on the comparison between 3.0 baseline, 4.0 chunk cache, and 4.0 no chunk 
cache, disabling chunk-cache didn't bridge the gap between 3.0 and 4.0. I 
wonder if something else is affecting the perf instead of chunk cache. do you 
have 

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and has no flag to disable it without 
> impacting networking.  In performance testing 4.0 against 3.0, I found that 
> reads were slower in 4.0, and profiling showed that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}
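
To put the two histogram summaries above side by side, the relative change in the mean can be computed directly. This is a small sketch with hypothetical names; the only inputs are the two Mean values printed above, whose unit the output does not state. Note that the summaries also show a lower StdDeviation but a higher Max without the chunk cache, so the mean alone does not describe the tail.

```java
// Hypothetical helper for comparing the two Mean values quoted above.
final class LatencyDeltaSketch
{
    // Relative reduction of `after` versus `before`, as a percentage.
    static double reductionPercent(double before, double after)
    {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args)
    {
        double meanWithChunkCache = 11.50063;   // Mean from 40_w_cc-selects.hdr
        double meanWithoutChunkCache = 9.82115; // Mean from 40_wo_cc-selects.hdr
        System.out.printf("mean reduced by %.1f%%%n",
                          reductionPercent(meanWithChunkCache, meanWithoutChunkCache));
    }
}
```

With these inputs the reduction in the mean comes out at a little under 15%.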






[jira] [Comment Edited] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-08-12 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176756#comment-17176756
 ] 

ZhaoYang edited comment on CASSANDRA-16036 at 8/13/20, 4:22 AM:


[~dcapwell] Sorry, I won't be able to look deeper into the chunk cache this week. 
But based on the comparison between the 3.0 baseline, 4.0 with chunk cache, and 
4.0 without chunk cache, disabling the chunk cache didn't bridge the gap between 
3.0 and 4.0. I wonder if something other than the chunk cache is affecting 
perf. Do you have a JFR recording?


was (Author: jasonstack):
[~dcapwell] sorry, won't be able to look deeper into chunk cache this week. But 
based on the comparison between 3.0 baseline, 4.0 chunk cache, and 4.0 no chunk 
cache, disabling chunk-cache didn't bridge the gap between 3.0 and 4.0. I 
wonder if something else is affecting the perf instead of chunk cache. do you 
have 

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and has no flag to disable it without 
> impacting networking.  In performance testing 4.0 against 3.0, I found that 
> reads were slower in 4.0, and profiling showed that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}






[jira] [Updated] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-11 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15861:
-
Test and Documentation Plan: 
https://circleci.com/workflow-run/610e8169-e60c-420b-a556-4120967db6cb  (was: 
https://circleci.com/workflow-run/9e2af3a1-7b63-423d-8cde-d2cd178c81d6)

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{CassandraEntireSSTableStreamReader#read}}.
> {code}
> Currently, entire-sstable-streaming requires sstable components to be 
> immutable, because the {{ComponentManifest}} with component sizes is sent 
> before the actual files. This isn't a problem in legacy streaming as STATS file

[jira] [Comment Edited] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-11 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175412#comment-17175412
 ] 

ZhaoYang edited comment on CASSANDRA-15861 at 8/11/20, 10:48 AM:
-

bq. 1) Orphaned hard links need to be cleaned up on startup.

If the hard links end with `.tmp`, they will be cleaned up on startup by 
{{StartupChecks#checkSystemKeyspaceState}}

bq. 2) Using the streaming session id for the hard link name, instead of a time 
uuid, would make debugging some issues easier.

I think the same streaming plan id is used by different peers. It may fail to 
create hardlink when streaming the same sstables to different peers in the same 
stream plan. 
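
To illustrate the concern, here is a small sketch. It assumes a hypothetical naming scheme in which the link name is derived only from an id; `LinkNaming` and its methods are made up for illustration and are not Cassandra code. `Files.createLink` throws `FileAlreadyExistsException` when the link path already exists, so reusing one plan id for two peers would fail on the second link, while a fresh UUID per link would not.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper, only to show the name collision.
class LinkNaming {
    // Creates "<component>.<id>.tmp" as a hard link to the component.
    static Path link(Path component, UUID id) throws IOException {
        Path link = component.resolveSibling(component.getFileName() + "." + id + ".tmp");
        return Files.createLink(link, component);
    }

    // Returns true if linking the same component twice under one id fails.
    static boolean collides(Path component, UUID sharedId) throws IOException {
        link(component, sharedId);          // first peer in the plan
        try {
            link(component, sharedId);      // second peer, same plan id
            return false;
        } catch (FileAlreadyExistsException e) {
            return true;
        }
    }
}
```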

bq. We could leave ComponentManifest the way it was before this patch and have 
a separate class, let's call it ComponentContext, that embeds it.

+1

bq. In this case, if you could guarantee that no more than 1 index resample can 
happen at once for a given sstable, the only thing you'd need to synchronize in 
`cloneWithNewSummarySamplingLevel` is `saveSummary`. If you did that, you could 
just synchronize hard link creation on `tidy.global`, instead of introducing a 
new lock.

Agreed with caleb, no more than 1 index resample can happen concurrently for a 
given sstable as sstable is marked as compacting before resampling.

bq. That leaves indexSummary, which perhaps we could make volatile, and all the 
state used in cloneAndReplace()...but we could just extend the synchronized 
(tidy.global) block to include the latter. Nothing expensive happens inside 
cloneAndReplace(), AFAICT.

good idea

bq. synchronized (tidy.global)

The old approach was to synchronize entire streaming phase, so I didn't use 
"synchronized (tidy.global)" which may block concurrent compactions. 

But now only hard-link creation is synchronized, using "synchronized 
(tidy.global)" is better than introducing a new lock.



was (Author: jasonstack):
bq. 1) Orphaned hard links need to be cleaned up on startup.

If the hard links end with `.tmp`, they will be cleaned up on startup by 
{{StartupChecks#checkSystemKeyspaceState}}

bq. 2) Using the streaming session id for the hard link name, instead of a time 
uuid, would make debugging some issues easier.

+1

bq. We could leave ComponentManifest the way it was before this patch and have 
a separate class, let's call it ComponentContext, that embeds it.

+1

bq. In this case, if you could guarantee that no more than 1 index resample can 
happen at once for a given sstable, the only thing you'd need to synchronize in 
`cloneWithNewSummarySamplingLevel` is `saveSummary`. If you did that, you could 
just synchronize hard link creation on `tidy.global`, instead of introducing a 
new lock.

Agreed with caleb, no more than 1 index resample can happen concurrently for a 
given sstable as sstable is marked as compacting before resampling.

bq. That leaves indexSummary, which perhaps we could make volatile, and all the 
state used in cloneAndReplace()...but we could just extend the synchronized 
(tidy.global) block to include the latter. Nothing expensive happens inside 
cloneAndReplace(), AFAICT.

good idea

bq. synchronized (tidy.global)

The old approach was to synchronize entire streaming phase, so I didn't use 
"synchronized (tidy.global)" which may block concurrent compactions. 

But now only hard-link creation is synchronized, using "synchronized 
(tidy.global)" is better than introducing a new lock.



[jira] [Comment Edited] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-11 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175412#comment-17175412
 ] 

ZhaoYang edited comment on CASSANDRA-15861 at 8/11/20, 9:47 AM:


bq. 1) Orphaned hard links need to be cleaned up on startup.

If the hard links end with `.tmp`, they will be cleaned up on startup by 
{{StartupChecks#checkSystemKeyspaceState}}

bq. 2) Using the streaming session id for the hard link name, instead of a time 
uuid, would make debugging some issues easier.

+1

bq. We could leave ComponentManifest the way it was before this patch and have 
a separate class, let's call it ComponentContext, that embeds it.

+1

bq. In this case, if you could guarantee that no more than 1 index resample can 
happen at once for a given sstable, the only thing you'd need to synchronize in 
`cloneWithNewSummarySamplingLevel` is `saveSummary`. If you did that, you could 
just synchronize hard link creation on `tidy.global`, instead of introducing a 
new lock.

Agreed with caleb, no more than 1 index resample can happen concurrently for a 
given sstable as sstable is marked as compacting before resampling.

bq. That leaves indexSummary, which perhaps we could make volatile, and all the 
state used in cloneAndReplace()...but we could just extend the synchronized 
(tidy.global) block to include the latter. Nothing expensive happens inside 
cloneAndReplace(), AFAICT.

good idea

bq. synchronized (tidy.global)

The old approach was to synchronize the entire streaming phase, so I didn't use 
"synchronized (tidy.global)" which may block concurrent compactions. 

But now only hard-link creation is synchronized, using "synchronized 
(tidy.global)" is better than introducing a new lock.



was (Author: jasonstack):
bq. 1) Orphaned hard links need to be cleaned up on startup.

If the hard links end with `.tmp`, they will be cleaned up on startup by 
{{StartupChecks#checkSystemKeyspaceState}}

bq. 2) Using the streaming session id for the hard link name, instead of a time 
uuid, would make debugging some issues easier.

+1

bq. We could leave ComponentManifest the way it was before this patch and have 
a separate class, let's call it ComponentContext, that embeds it.

+1

bq. In this case, if you could guarantee that no more than 1 index resample can 
happen at once for a given sstable, the only thing you'd need to synchronize in 
`cloneWithNewSummarySamplingLevel` is `saveSummary`. If you did that, you could 
just synchronize hard link creation on `tidy.global`, instead of introducing a 
new lock.

Agreed with caleb, no more than 1 index resample can happen concurrently for a 
given sstable as sstable is marked as compacting before resampling.

bq. That leaves indexSummary, which perhaps we could make volatile, and all the 
state used in cloneAndReplace()...but we could just extend the synchronized 
(tidy.global) block to include the latter. Nothing expensive happens inside 
cloneAndReplace(), AFAICT.

good idea

bq. synchronized (tidy.global)





[jira] [Comment Edited] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-11 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175412#comment-17175412
 ] 

ZhaoYang edited comment on CASSANDRA-15861 at 8/11/20, 9:47 AM:


bq. 1) Orphaned hard links need to be cleaned up on startup.

If the hard links end with `.tmp`, they will be cleaned up on startup by 
{{StartupChecks#checkSystemKeyspaceState}}

bq. 2) Using the streaming session id for the hard link name, instead of a time 
uuid, would make debugging some issues easier.

+1

bq. We could leave ComponentManifest the way it was before this patch and have 
a separate class, let's call it ComponentContext, that embeds it.

+1

bq. In this case, if you could guarantee that no more than 1 index resample can 
happen at once for a given sstable, the only thing you'd need to synchronize in 
`cloneWithNewSummarySamplingLevel` is `saveSummary`. If you did that, you could 
just synchronize hard link creation on `tidy.global`, instead of introducing a 
new lock.

Agreed with caleb, no more than 1 index resample can happen concurrently for a 
given sstable as sstable is marked as compacting before resampling.

bq. That leaves indexSummary, which perhaps we could make volatile, and all the 
state used in cloneAndReplace()...but we could just extend the synchronized 
(tidy.global) block to include the latter. Nothing expensive happens inside 
cloneAndReplace(), AFAICT.

good idea

bq. synchronized (tidy.global)

The old approach was to synchronize entire streaming phase, so I didn't use 
"synchronized (tidy.global)" which may block concurrent compactions. 

But now only hard-link creation is synchronized, using "synchronized 
(tidy.global)" is better than introducing a new lock.



was (Author: jasonstack):
bq. 1) Orphaned hard links need to be cleaned up on startup.

If the hard links end with `.tmp`, they will be cleaned up on startup by 
{{StartupChecks#checkSystemKeyspaceState}}

bq. 2) Using the streaming session id for the hard link name, instead of a time 
uuid, would make debugging some issues easier.

+1

bq. We could leave ComponentManifest the way it was before this patch and have 
a separate class, let's call it ComponentContext, that embeds it.

+1

bq. In this case, if you could guarantee that no more than 1 index resample can 
happen at once for a given sstable, the only thing you'd need to synchronize in 
`cloneWithNewSummarySamplingLevel` is `saveSummary`. If you did that, you could 
just synchronize hard link creation on `tidy.global`, instead of introducing a 
new lock.

Agreed with caleb, no more than 1 index resample can happen concurrently for a 
given sstable as sstable is marked as compacting before resampling.

bq. That leaves indexSummary, which perhaps we could make volatile, and all the 
state used in cloneAndReplace()...but we could just extend the synchronized 
(tidy.global) block to include the latter. Nothing expensive happens inside 
cloneAndReplace(), AFAICT.

good idea

bq. synchronized (tidy.global)

The old approach was to synchronized entire streaming phase, so I didn't use 
"synchronized (tidy.global)" which may block concurrent compactions. 

But now only hard-link creation is synchronized, using "synchronized 
(tidy.global)" is better than introducing a new lock.



[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-08-11 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175412#comment-17175412
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

bq. 1) Orphaned hard links need to be cleaned up on startup.

If the hard links end with `.tmp`, they will be cleaned up on startup by 
{{StartupChecks#checkSystemKeyspaceState}}

bq. 2) Using the streaming session id for the hard link name, instead of a time 
uuid, would make debugging some issues easier.

+1

bq. We could leave ComponentManifest the way it was before this patch and have 
a separate class, let's call it ComponentContext, that embeds it.

+1

bq. In this case, if you could guarantee that no more than 1 index resample can 
happen at once for a given sstable, the only thing you'd need to synchronize in 
`cloneWithNewSummarySamplingLevel` is `saveSummary`. If you did that, you could 
just synchronize hard link creation on `tidy.global`, instead of introducing a 
new lock.

Agreed with caleb, no more than 1 index resample can happen concurrently for a 
given sstable as sstable is marked as compacting before resampling.

bq. That leaves indexSummary, which perhaps we could make volatile, and all the 
state used in cloneAndReplace()...but we could just extend the synchronized 
(tidy.global) block to include the latter. Nothing expensive happens inside 
cloneAndReplace(), AFAICT.

good idea

bq. synchronized (tidy.global)





[jira] [Updated] (CASSANDRA-16044) Query SSTable Indexes lazily in token sorted runs for LCS, TWCS or RangeAwaredCompaction

2020-08-11 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16044:
-
Fix Version/s: 4.x

> Query SSTable Indexes lazily in token sorted runs for LCS, TWCS or 
> RangeAwaredCompaction
> 
>
> Key: CASSANDRA-16044
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16044
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SASI
>Reporter: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
>
> Currently SASI searches all SSTable indexes that may include the query 
> partition key and indexed term, but this causes a large IO overhead for 
> range index queries (e.g. age > 18) when the sstable count is huge.
> Proposed improvement: query sstable indexes in token-sorted runs lazily. When 
> the data in the first few token ranges is sufficient to satisfy the limit, SASI 
> can avoid the overhead of searching sstable indexes for the remaining ranges.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16044) Query SSTable Indexes lazily in token sorted runs for LCS, TWCS or RangeAwaredCompaction

2020-08-11 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16044:
-
Summary: Query SSTable Indexes lazily in token sorted runs for LCS, TWCS or 
RangeAwaredCompaction  (was: Query SSTable Indexes in token sorted runs for LCS 
and TWCS)







[jira] [Created] (CASSANDRA-16044) Query SSTable Indexes in token sorted runs for LCS and TWCS

2020-08-11 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-16044:


 Summary: Query SSTable Indexes in token sorted runs for LCS and 
TWCS
 Key: CASSANDRA-16044
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16044
 Project: Cassandra
  Issue Type: Improvement
  Components: Feature/SASI
Reporter: ZhaoYang


Currently SASI searches all SSTable indexes that may include the query 
partition key and indexed term, but this causes a large IO overhead for 
range index queries (e.g. age > 18) when the sstable count is huge.

Proposed improvement: query sstable indexes in token-sorted runs lazily. When 
the data in the first few token ranges is sufficient to satisfy the limit, SASI 
can avoid the overhead of searching sstable indexes for the remaining ranges.
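
A rough sketch of the lazy idea, not SASI's actual code: `LazyRunSearch` is a hypothetical name, rows are plain integers, and each `Supplier` stands for the (expensive) index search of one token range. Ranges are visited in token order and a range is only searched if the earlier ones did not already fill the limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration of querying token-sorted runs lazily.
class LazyRunSearch {
    // Each supplier performs the index search for one token range when invoked.
    static List<Integer> search(List<Supplier<List<Integer>>> runsInTokenOrder, int limit) {
        List<Integer> rows = new ArrayList<>();
        for (Supplier<List<Integer>> run : runsInTokenOrder) {
            if (rows.size() >= limit)
                break;                      // remaining ranges are never searched
            for (Integer row : run.get()) {
                if (rows.size() >= limit)
                    break;
                rows.add(row);
            }
        }
        return rows;
    }
}
```

With many sstables and a small limit, most suppliers are never invoked, which is where the IO saving would come from.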






[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-07-29 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167210#comment-17167210
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

[~maedhroz] thanks for the feedback. I have squashed and pushed.

 
There are two types of concurrent component mutations:
* index summary redistribution compaction - deletes the index summary and writes 
a new one
* pending repair manager's RepairFinishedCompactionTask - atomically replaces the 
old stats file with a new one (delete and rewrite on Windows).

To avoid streaming a ComponentManifest that does not match the files, the 
manifest now creates hard links to the mutable components and streams the 
hard-linked files instead of the original files, which may have been modified.

To prevent creating hard links to a partially written index summary (or stats 
file on Windows), a read lock is needed to create hard links and a write lock is 
needed to save the index summary and stats metadata.
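
The locking scheme could look roughly like the sketch below. `ComponentGuard` and its method names are hypothetical and not taken from the patch. Link creation shares the read lock, so concurrent streams do not block each other, while a delete-and-rewrite of the component takes the write lock.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical guard around one mutable sstable component.
class ComponentGuard {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Many streams may create links at once; none may overlap with a rewrite.
    Path hardLink(Path component, Path link) throws IOException {
        lock.readLock().lock();
        try {
            return Files.createLink(link, component);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Delete-then-rewrite under the write lock, so no link sees a partial file.
    void rewrite(Path component, byte[] contents) throws IOException {
        lock.writeLock().lock();
        try {
            Files.deleteIfExists(component);
            Files.write(component, contents, StandardOpenOption.CREATE_NEW);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Because the rewrite creates a new file rather than overwriting in place, an existing hard link keeps pointing at the old contents, which is what the stream relies on.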

With this approach, only saving the index summary may block entire-sstable 
streaming, and index summary redistribution is not very frequent. We can get rid 
of the blocking by writing the index summary to a temp file and replacing the 
old summary atomically.
(Note: atomic replace doesn't work on Windows, so we have to delete first)
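
A minimal sketch of the temp-file-and-rename variant, using only `java.nio.file`. `AtomicRewrite` is a hypothetical name, and the non-atomic fallback is an assumption about how one might handle platforms without atomic rename, not something the patch is known to do.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: write new contents aside, then swap them in.
class AtomicRewrite {
    static void replace(Path target, byte[] contents) throws IOException {
        // Same directory as the target, so the rename stays on one filesystem.
        Path tmp = Files.createTempFile(target.getParent(), target.getFileName().toString(), ".tmp");
        Files.write(tmp, contents);
        try {
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Fallback for filesystems without atomic rename: not atomic.
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A hard link created before the swap still refers to the old file contents, so no lock is needed around the write itself.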



[jira] [Commented] (CASSANDRA-15665) StreamManager should clearly differentiate between "initiator" and "receiver" sessions

2020-07-24 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164532#comment-17164532
 ] 

ZhaoYang commented on CASSANDRA-15665:
--

The version barrier is defined in {{MessagingService.accept_streaming}}.
 

> StreamManager should clearly differentiate between "initiator" and "receiver" 
> sessions
> --
>
> Key: CASSANDRA-15665
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15665
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta1
>
>
> {{StreamManager}} currently does a suboptimal job of differentiating between 
> stream sessions (in the form of {{StreamResultFuture}}) which have been either 
> initiated or "received", for the following reasons:
> 1) Naming is IMO confusing: a "receiver" session could actually both send and 
> receive files, so technically an initiator is also a receiver.
> 2) {{StreamManager#findSession()}} assumes we should first look into 
> "initiator" sessions, then into "receiver" ones: this is a dangerous 
> assumption, in particular for test environments where the same process could 
> work as both an initiator and a receiver.
> I would recommend the following changes:
> 1) Rename "receiver" with "follower" everywhere the former is used.
> 2) Introduce a new flag into {{StreamMessageHeader}} to signal if the message 
> comes from an initiator or follower session, in order to correctly 
> differentiate and look for sessions in {{StreamManager}}.
> While my arguments above might seem trivial, I believe they will improve 
> clarity and save from potential bugs/headaches at testing time, and doing 
> such changes now that we're revamping streaming for 4.0 seems the right time.
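
The second recommendation in the description above can be illustrated with a small sketch. This is not the actual {{StreamManager}} code: the class {{SessionRegistrySketch}}, the {{Role}} enum and the {{sentByInitiator}} parameter are hypothetical names chosen for illustration. It only shows why a role flag in the message header removes the ambiguity when one process holds both an initiator and a follower session for the same plan id.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: sessions are kept in two maps, and the lookup is
// driven by a flag carried in the message header instead of a fixed
// "initiators first, then receivers" search order.
public class SessionRegistrySketch {
    enum Role { INITIATOR, FOLLOWER }

    static final class Session {
        final UUID planId;
        final Role role;

        Session(UUID planId, Role role) {
            this.planId = planId;
            this.role = role;
        }
    }

    private final Map<UUID, Session> initiators = new ConcurrentHashMap<>();
    private final Map<UUID, Session> followers = new ConcurrentHashMap<>();

    void register(Session session) {
        Map<UUID, Session> target = session.role == Role.INITIATOR ? initiators : followers;
        target.put(session.planId, session);
    }

    // A message sent by an initiator must be handled by the local follower
    // session, and a message sent by a follower by the local initiator session.
    Session findSession(UUID planId, boolean sentByInitiator) {
        return (sentByInitiator ? followers : initiators).get(planId);
    }

    public static void main(String[] args) {
        SessionRegistrySketch registry = new SessionRegistrySketch();
        UUID planId = UUID.randomUUID();
        // Same process acts as both sides, as can happen in a test environment.
        registry.register(new Session(planId, Role.INITIATOR));
        registry.register(new Session(planId, Role.FOLLOWER));

        System.out.println(registry.findSession(planId, true).role);
        System.out.println(registry.findSession(planId, false).role);
    }
}
```

With a fixed search order, both lookups in {{main}} would return the initiator session; with the flag, each message reaches the session on the opposite side of the exchange.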



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15665) StreamManager should clearly differentiate between "initiator" and "receiver" sessions

2020-07-23 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164075#comment-17164075
 ] 

ZhaoYang edited comment on CASSANDRA-15665 at 7/24/20, 1:57 AM:


[~maedhroz] does it fail anything? I think we don't allow cross-version 
streaming between 3.x and 4.0. It's guarded by version when establishing 
connections.


was (Author: jasonstack):
[~maedhroz] does it fail anything? I think we don't allow cross-version 
streaming between 3.x and 4.0.

> StreamManager should clearly differentiate between "initiator" and "receiver" 
> sessions
> --
>
> Key: CASSANDRA-15665
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15665
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta1
>
>
> {{StreamManager}} currently does a suboptimal job of differentiating between 
> stream sessions (in the form of {{StreamResultFuture}}) which have been either 
> initiated or "received", for the following reasons:
> 1) Naming is IMO confusing: a "receiver" session could actually both send and 
> receive files, so technically an initiator is also a receiver.
> 2) {{StreamManager#findSession()}} assumes we should first look into 
> "initiator" sessions, then into "receiver" ones: this is a dangerous 
> assumption, in particular for test environments where the same process could 
> work as both an initiator and a receiver.
> I would recommend the following changes:
> 1) Rename "receiver" to "follower" everywhere the former is used.
> 2) Introduce a new flag into {{StreamMessageHeader}} to signal if the message 
> comes from an initiator or follower session, in order to correctly 
> differentiate and look for sessions in {{StreamManager}}.
> While my arguments above might seem trivial, I believe they will improve 
> clarity and save us from potential bugs/headaches at testing time, and now 
> that we're revamping streaming for 4.0 seems the right time to make such changes.






[jira] [Commented] (CASSANDRA-15665) StreamManager should clearly differentiate between "initiator" and "receiver" sessions

2020-07-23 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164075#comment-17164075
 ] 

ZhaoYang commented on CASSANDRA-15665:
--

[~maedhroz] does it fail anything? I think we don't allow cross-version 
streaming between 3.x and 4.0.

> StreamManager should clearly differentiate between "initiator" and "receiver" 
> sessions
> --
>
> Key: CASSANDRA-15665
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15665
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta1
>
>
> {{StreamManager}} currently does a suboptimal job of differentiating between 
> stream sessions (in the form of {{StreamResultFuture}}) which have been either 
> initiated or "received", for the following reasons:
> 1) Naming is IMO confusing: a "receiver" session could actually both send and 
> receive files, so technically an initiator is also a receiver.
> 2) {{StreamManager#findSession()}} assumes we should first look into 
> "initiator" sessions, then into "receiver" ones: this is a dangerous 
> assumption, in particular for test environments where the same process could 
> work as both an initiator and a receiver.
> I would recommend the following changes:
> 1) Rename "receiver" to "follower" everywhere the former is used.
> 2) Introduce a new flag into {{StreamMessageHeader}} to signal if the message 
> comes from an initiator or follower session, in order to correctly 
> differentiate and look for sessions in {{StreamManager}}.
> While my arguments above might seem trivial, I believe they will improve 
> clarity and save us from potential bugs/headaches at testing time, and now 
> that we're revamping streaming for 4.0 seems the right time to make such changes.






[jira] [Comment Edited] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-07-23 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163769#comment-17163769
 ] 

ZhaoYang edited comment on CASSANDRA-15861 at 7/23/20, 5:29 PM:


Updated the patch to load the stats component into memory, so that entire-sstable 
streaming will not block LCS and incremental repair.

 

If we want to reduce the blocking time for index summary redistribution, we can 
consider:
 * writing the new index summary to a temp file and replacing the old file 
atomically; at the beginning of streaming, open all file channel instances up 
front, so that they still point to the old files (this is file-system dependent).
 * writing the new index summary to a temp file and replacing the old file 
atomically; on the streaming side, use a hard link to make sure it streams the 
same file.

WDYT?
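
A minimal sketch of the second option above (temp file plus atomic replace, with a hard link on the streaming side), using only {{java.nio.file}}. The class name {{AtomicSummaryReplace}}, the method names and the {{.tmp}} / {{.streaming}} suffixes are assumptions made for illustration and are not taken from the patch. Hard links and atomic replacement are file-system dependent, so this is expected to behave as described on a typical POSIX file system only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicSummaryReplace {
    // Writes the new content to a sibling temp file, then renames it over the
    // target so readers see either the old or the new file, never a mix.
    static void replaceAtomically(Path target, byte[] newContent) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, newContent);
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
    }

    // Creates a hard link to the current file. The link keeps pointing at the
    // old content even after the original name is replaced.
    static Path pinForStreaming(Path component) throws IOException {
        Path link = component.resolveSibling(component.getFileName() + ".streaming");
        return Files.createLink(link, component);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("summary-demo");
        Path summary = dir.resolve("na-1-big-Summary.db");
        Files.write(summary, "old-summary".getBytes(StandardCharsets.UTF_8));

        // Streaming side pins the file before the redistribution happens.
        Path pinned = pinForStreaming(summary);

        // Redistribution side replaces the summary without blocking on streaming.
        replaceAtomically(summary, "new-summary".getBytes(StandardCharsets.UTF_8));

        System.out.println("streamed: " + new String(Files.readAllBytes(pinned), StandardCharsets.UTF_8));
        System.out.println("on disk:  " + new String(Files.readAllBytes(summary), StandardCharsets.UTF_8));
    }
}
```

The streaming side would delete the link once the transfer completes; that cleanup is omitted here.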


was (Author: jasonstack):
Updated the patch to load the stats component into memory, so that entire-sstable 
streaming will not block LCS and incremental repair.

 

If we want to reduce the blocking time for index summary redistribution, we can 
consider: writing new index summary to a temp file and replacing the old file 
atomically; at the beginning of streaming, open all file channel instances 
which still point to the old files (this is file system dependent). WDYT?

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-mess

[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-07-23 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163769#comment-17163769
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

Updated the patch to load the stats component into memory, so that entire-sstable 
streaming will not block LCS and incremental repair.

 

If we want to reduce the blocking time for index summary redistribution, we can 
consider: writing new index summary to a temp file and replacing the old file 
atomically; at the beginning of streaming, open all file channel instances 
which still point to the old files (this is file system dependent). WDYT?
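
The idea of loading the stats component into memory can be sketched as follows. This is an illustration only, not the patch itself: {{InMemoryComponentSnapshot}}, {{Snapshot}} and {{capture}} are hypothetical names, and the file content is a made-up stand-in for the real Statistics.db format. The point is that the length recorded in the manifest and the bytes that are later streamed come from the same in-memory copy, so a concurrent mutation of the on-disk file cannot make them disagree.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class InMemoryComponentSnapshot {
    // Hypothetical holder for a component captured when the manifest is built.
    static final class Snapshot {
        final byte[] bytes;

        Snapshot(byte[] bytes) {
            this.bytes = bytes;
        }

        long length() {
            return bytes.length;
        }
    }

    // Reads the whole component once; later file mutations do not affect it.
    static Snapshot capture(Path component) throws IOException {
        return new Snapshot(Files.readAllBytes(component));
    }

    public static void main(String[] args) throws IOException {
        Path stats = Files.createTempFile("na-2-big-Statistics", ".db");
        Files.write(stats, "pendingRepair=session-id".getBytes(StandardCharsets.UTF_8));

        // Manifest creation: record the length from the in-memory copy.
        Snapshot snapshot = capture(stats);
        long manifestLength = snapshot.length();

        // Concurrent mutation (e.g. by compaction) changes the on-disk size.
        Files.write(stats, "pendingRepair=null".getBytes(StandardCharsets.UTF_8));

        System.out.println("manifest length:  " + manifestLength);
        System.out.println("bytes to stream:  " + snapshot.bytes.length);
        System.out.println("on-disk length:   " + Files.size(stats));
    }
}
```

This trades a small amount of heap per streamed sstable for not having to hold a lock on the component during the transfer, which is reasonable only for small components such as the stats file.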

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and

[jira] [Updated] (CASSANDRA-15972) SASI should handle ReversedType when using "instanceof" on AbstractType

2020-07-22 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15972:
-
Severity: Low  (was: Normal)

> SASI should handle ReversedType when using "instanceof" on AbstractType
> ---
>
> Key: CASSANDRA-15972
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15972
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: ZhaoYang
>Priority: Low
> Fix For: 4.x
>
>
> {code:java}
> createTable("CREATE TABLE %s (pk int, ck text, v int, primary key(pk, ck)) 
> WITH CLUSTERING ORDER BY (ck DESC);");
> createIndex("CREATE CUSTOM INDEX ON %s (ck) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS" +
> " = {'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
> 'case_sensitive': 'false'} ");
> {code}






[jira] [Updated] (CASSANDRA-15972) SASI should handle ReversedType when using "instanceof" on AbstractType

2020-07-22 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15972:
-
Description: 
{code:java}
createTable("CREATE TABLE %s (pk int, ck text, v int, primary key(pk, ck)) WITH 
CLUSTERING ORDER BY (ck DESC);");
createIndex("CREATE CUSTOM INDEX ON %s (ck) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS" +
" = {'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'} ");
{code}

  was:
{code}
createTable("CREATE TABLE %s (pk int, ck text, v int, primary key(pk, ck)) WITH 
CLUSTERING ORDER BY (ck DESC);");
createIndex("CREATE CUSTOM INDEX ON %s (ck) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS" +
" = {'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'} ");
{code}


> SASI should handle ReversedType when using "instanceof" on AbstractType
> ---
>
> Key: CASSANDRA-15972
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15972
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: ZhaoYang
>Priority: Normal
>
> {code:java}
> createTable("CREATE TABLE %s (pk int, ck text, v int, primary key(pk, ck)) 
> WITH CLUSTERING ORDER BY (ck DESC);");
> createIndex("CREATE CUSTOM INDEX ON %s (ck) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS" +
> " = {'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
> 'case_sensitive': 'false'} ");
> {code}






[jira] [Created] (CASSANDRA-15972) SASI should handle ReversedType when using "instanceof" on AbstractType

2020-07-22 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-15972:


 Summary: SASI should handle ReversedType when using "instanceof" 
on AbstractType
 Key: CASSANDRA-15972
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15972
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/SASI
Reporter: ZhaoYang


{code}
createTable("CREATE TABLE %s (pk int, ck text, v int, primary key(pk, ck)) WITH 
CLUSTERING ORDER BY (ck DESC);");
createIndex("CREATE CUSTOM INDEX ON %s (ck) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS" +
" = {'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'} ");
{code}
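
The schema above puts the indexed column in DESC clustering order, which wraps its type in a reversed type. A hedged sketch of why a plain "instanceof" check then misses: the classes below are minimal stand-ins that only mirror the names {{AbstractType}}, {{ReversedType}} and {{UTF8Type}}; they are not the Cassandra classes, and {{unwrap}} here is a local helper written for this example.

```java
public class ReversedTypeCheck {
    // Minimal stand-ins for illustration only; not the real Cassandra types.
    static abstract class AbstractType {}

    static final class UTF8Type extends AbstractType {}

    static final class Int32Type extends AbstractType {}

    static final class ReversedType extends AbstractType {
        final AbstractType baseType;

        ReversedType(AbstractType baseType) {
            this.baseType = baseType;
        }
    }

    // Naive check: fails for a text column declared with DESC order,
    // because the outer type is the reversed wrapper.
    static boolean isTextNaive(AbstractType type) {
        return type instanceof UTF8Type;
    }

    // Local helper: strips the reversed wrapper, if present.
    static AbstractType unwrap(AbstractType type) {
        return type instanceof ReversedType ? ((ReversedType) type).baseType : type;
    }

    // Check on the unwrapped type: works for both ASC and DESC columns.
    static boolean isText(AbstractType type) {
        return unwrap(type) instanceof UTF8Type;
    }

    public static void main(String[] args) {
        AbstractType ck = new ReversedType(new UTF8Type()); // ck text, ORDER BY (ck DESC)
        System.out.println("naive:     " + isTextNaive(ck));
        System.out.println("unwrapped: " + isText(ck));
    }
}
```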






[jira] [Updated] (CASSANDRA-15972) SASI should handle ReversedType when using "instanceof" on AbstractType

2020-07-22 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15972:
-
 Bug Category: Parent values: Code(13163)
   Complexity: Normal
Discovered By: Code Inspection
Fix Version/s: 4.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> SASI should handle ReversedType when using "instanceof" on AbstractType
> ---
>
> Key: CASSANDRA-15972
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15972
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
>
> {code:java}
> createTable("CREATE TABLE %s (pk int, ck text, v int, primary key(pk, ck)) 
> WITH CLUSTERING ORDER BY (ck DESC);");
> createIndex("CREATE CUSTOM INDEX ON %s (ck) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS" +
> " = {'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
> 'case_sensitive': 'false'} ");
> {code}






[jira] [Updated] (CASSANDRA-15921) 4.0 quality testing: Materialized View

2020-07-20 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15921:
-
Description: 
The main purpose of this ticket is to get a better understanding of the 4.0 MV 
status as a guideline for future improvements. I don't think it should block 
the 4.0 release since MV is already marked as experimental.

Main areas to test:
 * Write perf: We expect to see [10% write throughput drop per MV 
added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
** Attached C40_MV.png  is alpha-4, 5-node, rf3 MV write tests: with 1 mv, 
throughput dropped 50%
 * Read perf: identical to normal table
 * Bootstrap/Decommission: no write path required since CASSANDRA-13065
 * Repair: write path required
 * Chaos monkey: take down coordinator/base-replica/view-replica during 
read/write/token-movement and verify data consistency (may need a tool)
 * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
 * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918

  was:
The main purpose of this ticket is to get a better understanding of the 4.0 MV 
status as a guideline for future improvements. I don't think it should block 
the 4.0 release since MV is already marked as experimental.

Main areas to test:
 * Write perf: We expect to see [10% write throughput drop per MV 
added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
** Attached C40_MV.png  is alpha-4, 5-node, rf3 MV write tests.
 * Read perf: identical to normal table
 * Bootstrap/Decommission: no write path required since CASSANDRA-13065
 * Repair: write path required
 * Chaos monkey: take down coordinator/base-replica/view-replica during 
read/write/token-movement and verify data consistency (may need a tool)
 * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
 * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918


> 4.0 quality testing: Materialized View
> --
>
> Key: CASSANDRA-15921
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15921
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/Materialized Views
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
> Attachments: C40_MV.png
>
>
> The main purpose of this ticket is to get a better understanding of the 4.0 MV 
> status as a guideline for future improvements. I don't think it should block 
> the 4.0 release since MV is already marked as experimental.
> Main areas to test:
>  * Write perf: We expect to see [10% write throughput drop per MV 
> added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
> ** Attached C40_MV.png  is alpha-4, 5-node, rf3 MV write tests: with 1 mv, 
> throughput dropped 50%
>  * Read perf: identical to normal table
>  * Bootstrap/Decommission: no write path required since CASSANDRA-13065
>  * Repair: write path required
>  * Chaos monkey: take down coordinator/base-replica/view-replica during 
> read/write/token-movement and verify data consistency (may need a tool)
>  * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
>  * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918






[jira] [Updated] (CASSANDRA-15921) 4.0 quality testing: Materialized View

2020-07-20 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15921:
-
Description: 
The main purpose of this ticket is to get a better understanding of the 4.0 MV 
status as a guideline for future improvements. I don't think it should block 
the 4.0 release since MV is already marked as experimental.

Main areas to test:
 * Write perf: We expect to see [10% write throughput drop per MV 
added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
** Attached C40_MV.png  is alpha-4, 5-node, rf3 MV write tests.
 * Read perf: identical to normal table
 * Bootstrap/Decommission: no write path required since CASSANDRA-13065
 * Repair: write path required
 * Chaos monkey: take down coordinator/base-replica/view-replica during 
read/write/token-movement and verify data consistency (may need a tool)
 * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
 * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918

  was:
The main purpose of this ticket is to get a better understanding of the 4.0 MV 
status as a guideline for future improvements. I don't think it should block 
the 4.0 release since MV is already marked as experimental.

Main areas to test:
 * Write perf: We expect to see [10% write throughput drop per MV 
added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
 * Read perf: identical to normal table
 * Bootstrap/Decommission: no write path required since CASSANDRA-13065
 * Repair: write path required
 * Chaos monkey: take down coordinator/base-replica/view-replica during 
read/write/token-movement and verify data consistency (may need a tool)
 * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
 * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918


> 4.0 quality testing: Materialized View
> --
>
> Key: CASSANDRA-15921
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15921
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/Materialized Views
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
> Attachments: C40_MV.png
>
>
> The main purpose of this ticket is to get a better understanding of the 4.0 MV 
> status as a guideline for future improvements. I don't think it should block 
> the 4.0 release since MV is already marked as experimental.
> Main areas to test:
>  * Write perf: We expect to see [10% write throughput drop per MV 
> added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
> ** Attached C40_MV.png  is alpha-4, 5-node, rf3 MV write tests.
>  * Read perf: identical to normal table
>  * Bootstrap/Decommision: no write-path required since CASSANDRA-13065
>  * Repair: write path required
>  * Chaos monkey: take down coordinator/base-replica/view-replica during 
> read/write/token-movement and verify data consistency (may need a tool)
>  * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
>  * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918






[jira] [Updated] (CASSANDRA-15921) 4.0 quality testing: Materialized View

2020-07-20 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15921:
-
Attachment: C40_MV.png

> 4.0 quality testing: Materialized View
> --
>
> Key: CASSANDRA-15921
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15921
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/Materialized Views
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
> Attachments: C40_MV.png
>
>
> The main purpose of this ticket is to get a better understanding of the 4.0 MV 
> status as a guideline for future improvements. I don't think it should block 
> the 4.0 release since MV is already marked as experimental.
> Main areas to test:
>  * Write perf: We expect to see [10% write throughput drop per MV 
> added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
>  * Read perf: identical to normal table
>  * Bootstrap/Decommission: no write path required since CASSANDRA-13065
>  * Repair: write path required
>  * Chaos monkey: take down coordinator/base-replica/view-replica during 
> read/write/token-movement and verify data consistency (may need a tool)
>  * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
>  * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918






[jira] [Updated] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-07-19 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15861:
-
Test and Documentation Plan: 
https://circleci.com/workflow-run/9e2af3a1-7b63-423d-8cde-d2cd178c81d6  (was: 
https://circleci.com/workflow-run/fde45c54-e845-4040-b59e-abcdabda2b29)
 Status: Patch Available  (was: Open)

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{CassandraEntireSSTableStreamReader#read}}.
> {code}
> Currently, entire-sstable-streaming requires sstable components to be 
> immutable, because {{ComponentManifest}}
> with component sizes are sent before sending actual fi

[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-07-19 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17160790#comment-17160790
 ] 

ZhaoYang commented on CASSANDRA-15861:
--

[~maedhroz] thanks for the suggestions.

bq. (where that "completion" happens in the non-SSL case isn't 100% clear to me)

The netty streaming itself is async, but 
{{CassandraEntireSSTableStreamWriter#write}} is effectively blocking because 
{{AsyncStreamingOutputPlus#flush}} waits for the data to be written to the 
network, so we don't need to worry about it.

I ended up with an sstable read/write lock approach:
* During entire-sstable streaming, {{CassandraOutgoingFile}} executes the 
streaming code while holding the sstable read lock, so multiple streams of the 
same sstable can run at the same time. I think it's fine to block stats 
mutation and index summary redistribution until streaming completes.
* Stats mutation and index summary redistribution perform the component 
mutation while holding the sstable write lock.
* I didn't reuse the synchronization on `tidy.global` because it is used in 
normal compaction tasks, so I added a separate read-write lock.

bq. simplest thing might be handling the stats and index summary in slightly 
different ways.

I feel that handling stats differently may make the code harder to maintain or 
reason about.
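
The read/write lock approach described above can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names ({{ComponentLockSketch}}, {{stream}}, {{mutate}}) are hypothetical and are not the names used in the patch. Streaming takes the shared read lock and component mutation takes the exclusive write lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical per-sstable guard; the names are illustrative, not Cassandra's API. */
final class ComponentLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Streaming holds the shared read lock, so several streams can run at once. */
    <T> T stream(Supplier<T> streamingTask) {
        lock.readLock().lock();
        try {
            return streamingTask.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Stats mutation or index summary redistribution holds the exclusive write lock. */
    void mutate(Runnable mutation) {
        lock.writeLock().lock();
        try {
            mutation.run();
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Number of read-lock holds currently outstanding. */
    int activeStreams() {
        return lock.getReadLockCount();
    }

    /** True while some thread holds the write lock. */
    boolean mutationInProgress() {
        return lock.isWriteLocked();
    }
}
```

With this shape, a mutation waits until all in-flight streams release the read lock, and new streams wait while a mutation holds the write lock, so the component sizes recorded before streaming cannot change mid-transfer.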


[jira] [Updated] (CASSANDRA-15766) NoSpamLogger arguments building objects on hot paths

2020-07-19 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15766:
-
Reviewers: ZhaoYang

LGTM

> NoSpamLogger arguments building objects on hot paths
> 
>
> Key: CASSANDRA-15766
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15766
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Logging
>Reporter: Jon Meredith
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> NoSpamLogger is used in hot logging paths to prevent logs being overrun.  For 
> that to be most effective the arguments to the logger need to be cheap to 
> construct.  During the internode messaging refactor CASSANDRA-15066, 
> performance changes to BufferPool for CASSANDRA-14416
> were accidentally reverted in the merge up from 3.11.
> Reviewing other uses since, it looks like there are a few places where the 
> arguments require some form of String building.
> org.apache.cassandra.net.InboundSink#accept
> org.apache.cassandra.net.InboundMessageHandler#processCorruptFrame
> org.apache.cassandra.net.InboundMessageHandler.LargeMessage#deserialize
> org.apache.cassandra.net.OutboundConnection#onOverloaded
> org.apache.cassandra.utils.memory.BufferPool.GlobalPool#allocateMoreChunks
> Formatting arguments should either be precomputed, or if expensive they 
> should be computed after the decision on whether to noSpamLog has been made.
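
As a rough illustration of the last point, the sketch below defers message construction until after the rate-limit decision by accepting a Supplier. It is an assumption-laden example: {{DeferredLogSketch}} and its {{warn}} method are hypothetical and are not the NoSpamLogger API, and the caller passes the current time explicitly to keep the example deterministic.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical rate-limited logger used only for illustration; not the NoSpamLogger API. */
final class DeferredLogSketch {
    private final long minIntervalNanos;
    // Earliest timestamp at which the next message may be emitted.
    private final AtomicLong nextAllowedNanos = new AtomicLong(Long.MIN_VALUE);

    DeferredLogSketch(long minIntervalNanos) {
        this.minIntervalNanos = minIntervalNanos;
    }

    /**
     * Returns true if the message was emitted. The supplier runs only after
     * the rate-limit check has passed, so suppressed calls build no strings.
     */
    boolean warn(long nowNanos, Supplier<String> message) {
        long next = nextAllowedNanos.get();
        if (nowNanos < next || !nextAllowedNanos.compareAndSet(next, nowNanos + minIntervalNanos))
            return false; // suppressed: the supplier is never invoked
        System.err.println(message.get()); // formatting cost is paid only here
        return true;
    }
}
```

A call such as {{log.warn(System.nanoTime(), () -> "Dropping " + count + " messages from " + peer)}} then only concatenates strings when the message is actually going to be logged.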



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-07-18 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15861:
-
Test and Documentation Plan: 
https://circleci.com/workflow-run/fde45c54-e845-4040-b59e-abcdabda2b29  (was: 
https://circleci.com/workflow-run/3a2fed2c-c469-4f3f-a620-07079f0dc0db)

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, it executes "nodetool repair" on node1 and kills node2 
> during repair. At the end, node3 reports checksum validation failure on 
> sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair started on node1, it performs anti-compaction which modifies 
> sstable's repairAt to 0 and pending repair id to session-id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be 
> transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}} which triggers async background 
> compaction.
> 5. Node1's background compaction will mutate sstable's repairAt to 0 and 
> pending repair id to null via  
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 where the sstable's STATS 
> component size is different from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to 
> mutate sstable level and "isTransient" attribute in 
> {{CassandraEntireSSTableStreamReader#read}}.
> {code}
> Currently, entire-sstable-streaming requires sstable components to be 
> immutable, because {{ComponentManifest}}
> with component sizes are sent before sending actual files. This isn't a 
> problem in legacy streaming as STATS file

[jira] [Updated] (CASSANDRA-15908) Improve messaging on indexing frozen collections

2020-07-14 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15908:
-
Reviewers: Bryn Cooke, ZhaoYang, ZhaoYang  (was: Bryn Cooke, ZhaoYang)
   Bryn Cooke, ZhaoYang, ZhaoYang
   Status: Review In Progress  (was: Patch Available)

> Improve messaging on indexing frozen collections
> 
>
> Key: CASSANDRA-15908
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15908
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Semantics
>Reporter: Rocco Varela
>Assignee: Rocco Varela
>Priority: Low
> Fix For: 4.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When attempting to create an index on a frozen collection the error message 
> produced can be improved to provide more detail about the problem and 
> possible workarounds. Currently, a user will receive a message indicating 
> "...Frozen collections only support full() indexes" which is not immediately 
> clear for users new to Cassandra indexing and datatype compatibility.
> Here is an example:
> {code:java}
> cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> CREATE TABLE test.mytable ( id int primary key, addresses 
> frozen> );
> cqlsh> CREATE INDEX mytable_addresses_idx on test.mytable (addresses);
>  InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot 
> create values() index on frozen column addresses. Frozen collections only 
> support full() indexes"{code}
>  
> I'm proposing possibly enhancing the messaging to something like this.
> {quote}Cannot create values() index on frozen column addresses. Frozen 
> collections only support indexes on the entire data structure due to 
> immutability constraints of being frozen, wrap your frozen column with the 
> full() target type to index properly.
> {quote}






[jira] [Updated] (CASSANDRA-15908) Improve messaging on indexing frozen collections

2020-07-14 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15908:
-
Test and Documentation Plan: circle ci tests
 Status: Patch Available  (was: Open)







[jira] [Commented] (CASSANDRA-15859) Avoid per-host hinted-handoff throttle being rounded to 0 in large cluster

2020-07-10 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155307#comment-17155307
 ] 

ZhaoYang commented on CASSANDRA-15859:
--

||patch||circle||
| [trunk|https://github.com/apache/cassandra/pull/616/files] | [ci|https://circleci.com/workflow-run/63f19a49-568a-4350-b368-9c33eeaa17de] |
| [3.11|https://github.com/apache/cassandra/pull/674/files] | [ci|https://circleci.com/workflow-run/f18b7afa-36c7-4d7b-a5a3-9792528cc963] |
| [3.0|https://github.com/apache/cassandra/pull/673/files] | [ci|https://circleci.com/workflow-run/e2b22eef-f0b2-4752-a4b6-f1b5766e170c] |

Ported to 3.0 and 3.11.
 

> Avoid per-host hinted-handoff throttle being rounded to 0 in large cluster
> --
>
> Key: CASSANDRA-15859
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15859
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> When "hinted_handoff_throttle_in_kb" is sufficiently small or the number of 
> nodes in the cluster is sufficiently large, the per-host throttle is rounded 
> down to 0, which the code below treats as unthrottled.
>  
> {code:java|title=HintsDispatchExecutor.java}
> int throttleInKB = DatabaseDescriptor.getHintedHandoffThrottleInKB() / nodesCount;
> this.rateLimiter = RateLimiter.create(throttleInKB == 0 ? Double.MAX_VALUE : throttleInKB * 1024);
> {code}
> [trunk-patch|https://github.com/apache/cassandra/pull/616]
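
The rounding can be reproduced with plain integer arithmetic. The sketch below is illustrative only: the class name, the sample values, and the "at least 1 KB" guard are assumptions for the example and are not necessarily what the linked patch does.

```java
/** Worked arithmetic for the per-host hint throttle; names and values are illustrative. */
final class HintThrottleSketch {
    /** Mirrors the integer division quoted above: the result truncates toward zero. */
    static int perHostThrottleKB(int totalThrottleKB, int nodesCount) {
        return totalThrottleKB / nodesCount;
    }

    /**
     * One possible guard (an assumption, not necessarily the committed fix):
     * a non-zero configured throttle never rounds down below 1 KB per host.
     */
    static int perHostThrottleKBAtLeastOne(int totalThrottleKB, int nodesCount) {
        if (totalThrottleKB == 0)
            return 0; // an explicit 0 keeps its "unthrottled" meaning
        return Math.max(1, totalThrottleKB / nodesCount);
    }
}
```

For example, a 1024 KB throttle divided across 2000 nodes gives 1024 / 2000 = 0 with integer division, which the quoted code then maps to Double.MAX_VALUE, while the guarded variant yields 1 KB per host.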






[jira] [Commented] (CASSANDRA-10307) Avoid always locking the partition key when a table has a materialized view

2020-07-09 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154367#comment-17154367
 ] 

ZhaoYang commented on CASSANDRA-10307:
--

The lock contention issue will not be a problem in a thread-per-core 
architecture, but the lock is still needed to prevent racing with a previous 
insertion that is waiting for async I/O from read-before-write.

> Avoid always locking the partition key when a table has a materialized view
> ---
>
> Key: CASSANDRA-10307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10307
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Materialized Views
>Reporter: T Jake Luciani
>Priority: Normal
>  Labels: materializedviews
> Fix For: 4.x
>
>
> When a table has associated materialized views we must restrict other 
> concurrent changes to the affected rows.  We currently lock the entire 
> partition.  
> The issue is that many updates to the same partition on the base table are 
> now effectively serialized.
> We can't lock the primary key instead because range tombstones cover a range 
> of rows.
> If we create (or reuse, if one already exists) a clustering range class, 
> we can lock at that level. 






[jira] [Commented] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-07-07 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152771#comment-17152771
 ] 

ZhaoYang commented on CASSANDRA-15900:
--

Thanks for the review.
 

> Close channel and reduce buffer allocation during entire sstable streaming 
> with SSL
> ---
>
> Key: CASSANDRA-15900
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15900
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> CASSANDRA-15740 added the ability to stream an entire sstable by loading the 
> on-disk file into a user-space off-heap buffer when SSL is enabled, because 
> netty doesn't support zero-copy with SSL.
> But there are two issues:
>  # The file channel is not closed.
>  # A 1 MB batch size is used. 1 MB exceeds the buffer pool's max allocation 
> size, so every batch is allocated outside the pool, which causes a large 
> number of allocations.
> [Patch|https://github.com/apache/cassandra/pull/651]:
>  # Close the file channel when the last batch is loaded into the off-heap 
> byte buffer. I don't think we need to wait until the buffer is flushed by netty.
>  # Reduce the batch size to 64 KB, which is more buffer-pool friendly when 
> streaming an entire sstable with SSL.
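
The two points of the patch can be illustrated with a small stand-alone sketch. It is not the patch itself: {{BatchedFileReadSketch}} and {{readInBatches}} are hypothetical names, and the buffers are allocated directly here, whereas a real implementation would presumably borrow them from a buffer pool. The file is read in batches of at most 64 KB, and the channel is closed by try-with-resources as soon as the last batch has been read.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

/** Hypothetical helper; illustrates batch size and channel lifetime only. */
final class BatchedFileReadSketch {
    static final int BATCH_SIZE = 64 * 1024;

    /**
     * Reads the file in batches of at most BATCH_SIZE bytes, hands each
     * batch to the sink, and returns the number of batches produced.
     */
    static int readInBatches(Path file, Consumer<ByteBuffer> sink) throws IOException {
        int batches = 0;
        // try-with-resources closes the channel once the last batch is read,
        // without waiting for the sink to finish with the buffers.
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long remaining = channel.size();
            while (remaining > 0) {
                int toRead = (int) Math.min(BATCH_SIZE, remaining);
                // Off-heap buffer; allocated directly in this sketch rather than pooled.
                ByteBuffer buffer = ByteBuffer.allocateDirect(toRead);
                while (buffer.hasRemaining()) {
                    if (channel.read(buffer) < 0)
                        throw new IOException("Unexpected end of file: " + file);
                }
                buffer.flip(); // make the batch readable for the sink
                sink.accept(buffer);
                remaining -= toRead;
                batches++;
            }
        }
        return batches;
    }
}
```

A 150,000-byte file, for instance, is delivered as two full 65,536-byte batches followed by one 18,928-byte batch.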






[jira] [Updated] (CASSANDRA-15921) 4.0 quality testing: Materialized View

2020-07-03 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15921:
-
Change Category: Quality Assurance
 Complexity: Normal
 Status: Open  (was: Triage Needed)

> 4.0 quality testing: Materialized View
> --
>
> Key: CASSANDRA-15921
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15921
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/Materialized Views
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
>
> The main purpose of this ticket is to get a better understanding of the 4.0 MV 
> status as a guideline for future improvements. I don't think it should block 
> the 4.0 release since MV is already marked as experimental.
> Main areas to test:
>  * Write perf: We expect to see [10% write throughput drop per MV 
> added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
>  * Read perf: identical to normal table
>  * Bootstrap/Decommission: no write-path required since CASSANDRA-13065
>  * Repair: write path required
>  * Chaos monkey: take down coordinator/base-replica/view-replica during 
> read/write/token-movement and verify data consistency (may need a tool)
>  * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
>  * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918






[jira] [Updated] (CASSANDRA-15921) 4.0 quality testing: Materialized View

2020-07-03 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15921:
-
Issue Type: Task  (was: Improvement)







[jira] [Commented] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-07-03 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151008#comment-17151008
 ] 

ZhaoYang commented on CASSANDRA-15900:
--

Both SimpleReadWriteTest and ImportTest passed locally with JDK 11; I don't 
think they use streaming.







[jira] [Updated] (CASSANDRA-15921) 4.0 quality testing: Materialized View

2020-07-03 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15921:
-
Fix Version/s: 4.x

> 4.0 quality testing: Materialized View
> --
>
> Key: CASSANDRA-15921
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15921
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Materialized Views
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.x
>
>






[jira] [Created] (CASSANDRA-15921) 4.0 quality testing: Materialized View

2020-07-03 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-15921:


 Summary: 4.0 quality testing: Materialized View
 Key: CASSANDRA-15921
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15921
 Project: Cassandra
  Issue Type: Improvement
  Components: Feature/Materialized Views
Reporter: ZhaoYang
Assignee: ZhaoYang


The main purpose of this ticket is to get a better understanding of the 4.0 MV 
status as a guideline for future improvements. I don't think it should block 
the 4.0 release since MV is already marked as experimental.

Main areas to test:
 * Write perf: We expect to see [10% write throughput drop per MV 
added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
 * Read perf: identical to normal table
 * Bootstrap/Decommission: no write-path required since CASSANDRA-13065
 * Repair: write path required
 * Chaos monkey: take down coordinator/base-replica/view-replica during 
read/write/token-movement and verify data consistency (may need a tool)
 * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
 * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918






[jira] [Updated] (CASSANDRA-15918) materialized view rebuild automatically after drop multiple views

2020-07-02 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15918:
-
Component/s: Feature/Materialized Views

> materialized view rebuild automatically after drop multiple views
> -
>
> Key: CASSANDRA-15918
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15918
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema, Consistency/Repair, Feature/Materialized 
> Views
>Reporter: chonghao li
>Priority: Normal
>
> Background:
> Cassandra version: 3.0.12
> Our Cassandra cluster has 9 hosts in DC1 and 3 hosts in DC2.
> Each host:
> ||node||memory||disk||
> |DC1 1|256 GB|5*788GB|
> |DC1 2|256 GB|5*788GB|
> |DC1 3|256 GB|5*788GB|
> |DC1 4|256 GB|5*788GB|
> |DC1 5|256 GB|5*788GB|
> |DC1 6|256 GB|5*788GB|
> |DC1 7|512 GB|5 TB|
> |DC1 8|512 GB|5 TB|
> |DC1 9|512 GB|5 TB|
> |DC2 1|256 GB|8*788GB|
> |DC2 2|256 GB|8*788GB|
> |DC2 3|256 GB|8*788GB|
> According to nodetool status, node load in DC1 is about 1.5 TB and node load in DC2 
> is about 4 TB.
> QPS: 270
> -
> Problem we met:
> On node DC1 1, we opened the cql command line and executed the following commands 
> at the same time:
> "drop materialized view if exists view1; 
> drop materialized view if exists view2;
> drop materialized view if exists view3;
> drop materialized view if exists view4;"
> After a while, the command line displayed a warning like "schema version mismatch 
> detected..." (sorry, we cannot find the exact output from that time).
> After that, we found that the view files on node "DC1 7" had not been deleted yet.
> At this moment, cluster performance dropped sharply, and the cluster almost 
> stopped responding to any request.
> By running:  select * from system.views_builds_in_progress;
> we could see that several views were being built.
> Then we executed:
> 1. nodetool stop VIEW_BUILD on each node
> 2. in cql: delete from system.views_builds_in_progress where view_name=
> 3. a rolling restart of the Cassandra nodes
>  
> About an hour later, performance returned to normal.
> --
> Why did this happen?
> How can we avoid this problem?
> Is there a better way to deal with this problem?






[jira] [Created] (CASSANDRA-15913) Avoid "ALLOW FILTERING" requirement for multiple restricted columns if index can handle them

2020-07-02 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-15913:


 Summary: Avoid "ALLOW FILTERING" requirement for multiple 
restricted columns if index can handle them
 Key: CASSANDRA-15913
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15913
 Project: Cassandra
  Issue Type: Improvement
  Components: Feature/SASI
Reporter: ZhaoYang


When executing the following query, "ALLOW FILTERING" is required even if both 
columns are indexed by SASI and used in {{QueryPlan}}:
bq. SELECT * FROM table WHERE age="20" and address="SF"

We should consider providing a proper {{QueryPlan}} under the {{Index}} 
interface to avoid "ALLOW FILTERING" when all restricted columns are handled by 
the index.
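
A minimal sketch of the condition described above, assuming the set of columns an index can handle is known up front. The class and method names are hypothetical and are not part of Cassandra's {{Index}} or {{QueryPlan}} API:

```java
import java.util.Set;

// Hypothetical helper for illustration only; not Cassandra code.
final class FilteringCheck
{
    /**
     * Returns true when at least one restricted column is NOT handled by the
     * index, i.e. when "ALLOW FILTERING" would still be needed.
     */
    static boolean requiresAllowFiltering(Set<String> restrictedColumns,
                                          Set<String> columnsHandledByIndex)
    {
        return !columnsHandledByIndex.containsAll(restrictedColumns);
    }
}
```

With restrictions on {{age}} and {{address}} and an index that handles both, the check returns false; if the index handled only {{age}}, it would return true.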






[jira] [Comment Edited] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-07-01 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149071#comment-17149071
 ] 

ZhaoYang edited comment on CASSANDRA-15900 at 7/1/20, 7:39 AM:
---

rebased and submitted another round of ci: 
[j8|https://circleci.com/workflow-run/cdf55335-c876-450b-8bf9-1d778a2df806] and 
[j11|https://circleci.com/workflow-run/2080f225-f689-4243-ad67-288bef608640]

bq. test_restart_node_localhost - 
pushed_notifications_test.TestPushedNotifications should have been addressed by 
CASSANDRA-15677 a few days ago.

it's failing after rebase...

bq. J11 - readRepairTest - 
org.apache.cassandra.distributed.test.SimpleReadWriteTest
bq. J11 - testImportCorrupt - org.apache.cassandra.db.ImportTest

doesn't seem to be related.



was (Author: jasonstack):
rebased and submit another round of ci: 
[j8|https://circleci.com/workflow-run/cdf55335-c876-450b-8bf9-1d778a2df806] and 
[j11|https://circleci.com/workflow-run/2080f225-f689-4243-ad67-288bef608640]

> Close channel and reduce buffer allocation during entire sstable streaming 
> with SSL
> ---
>
> Key: CASSANDRA-15900
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15900
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> CASSANDRA-15740 added the ability to stream entire sstable by loading on-disk 
> file into user-space off-heap buffer when SSL is enabled, because netty 
> doesn't support zero-copy with SSL.
> But there are two issues:
>  # file channel is not closed.
>  # 1mb batch size is used. 1mb exceeds the buffer pool's max allocation size, 
> thus each batch is allocated outside the pool, which causes a large number of 
> allocations.
> [Patch|https://github.com/apache/cassandra/pull/651]:
>  # close file channel when the last batch is loaded into off-heap bytebuffer. 
> I don't think we need to wait until buffer is flushed by netty.
>  # reduce the batch to 64kb which is more buffer pool friendly when streaming 
> entire sstable with SSL.
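
The two patch items can be sketched roughly as follows. This is not the code from the pull request: {{BatchedFileReader}} and {{readInBatches}} are hypothetical names, and the sketch allocates a fresh direct buffer per batch instead of using Cassandra's buffer pool. It only illustrates reading in 64kb batches and closing the file channel as soon as the last batch has been loaded:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not Cassandra's AsyncStreamingOutputPlus.
final class BatchedFileReader
{
    // 64kb batch size, as proposed in the ticket description.
    static final int BATCH_SIZE = 64 * 1024;

    /**
     * Reads the whole file in batches of at most batchSize bytes, hands each
     * filled off-heap buffer to the sink, and closes the channel once the last
     * batch has been read (or if reading fails). Returns the bytes read.
     */
    static long readInBatches(FileChannel channel, int batchSize, Consumer<ByteBuffer> sink)
    throws IOException
    {
        long total = 0;
        try (FileChannel fc = channel) // this method takes ownership of the channel
        {
            long length = fc.size();
            while (total < length)
            {
                int toRead = (int) Math.min(batchSize, length - total);
                ByteBuffer batch = ByteBuffer.allocateDirect(toRead); // off-heap buffer
                while (batch.hasRemaining())
                {
                    // positional read: does not depend on the channel's own position
                    if (fc.read(batch, total + batch.position()) < 0)
                        throw new IOException("Unexpected end of file");
                }
                batch.flip();
                total += batch.remaining();
                sink.accept(batch);
            }
        }
        return total;
    }
}
```

In this sketch the channel is closed when the read loop finishes, without waiting for the consumer of the buffers to flush them, which mirrors the first patch item.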






[jira] [Commented] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-30 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149071#comment-17149071
 ] 

ZhaoYang commented on CASSANDRA-15900:
--

rebased and submitted another round of ci: 
[j8|https://circleci.com/workflow-run/cdf55335-c876-450b-8bf9-1d778a2df806] and 
[j11|https://circleci.com/workflow-run/2080f225-f689-4243-ad67-288bef608640]







[jira] [Commented] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection

2020-06-30 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17148401#comment-17148401
 ] 

ZhaoYang commented on CASSANDRA-15907:
--

{quote}If the number of stale results is very large (i.e. a "silent" replica 
exists in the vast majority of responses), won't those two approaches result in 
about the same performance profile? 
{quote}
The second approach will execute RFP requests in two places:
 # at the beginning of the 2nd phase, based on the outdated rows collected in the 1st 
phase. These RFP requests can run in parallel and the number can be large.
 # at the merge listener, for additional rows requested by SRP. These RFP requests 
have to run serially, but the number is usually small.

 

> Operational Improvements & Hardening for Replica Filtering Protection
> -
>
> Key: CASSANDRA-15907
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15907
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Feature/2i Index
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
>  Labels: 2i, memory
> Fix For: 4.0-beta
>
>
> CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i 
> and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a 
> few things we should follow up on, however, to make life a bit easier for 
> operators and generally de-risk usage:
> (Note: Line numbers are based on {{trunk}} as of 
> {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.)
> *Minor Optimizations*
> * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be 
> able to use simple arrays instead of lists for {{rowsToFetch}} and 
> {{originalPartitions}}. Alternatively (or also), we may be able to null out 
> references in these two collections more aggressively. (ex. Using 
> {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, 
> assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.)
> * {{ReplicaFilteringProtection:323}} - We may be able to use 
> {{EncodingStats.merge()}} and remove the custom {{stats()}} method.
> * {{DataResolver:111 & 228}} - Cache an instance of 
> {{UnaryOperator#identity()}} instead of creating one on the fly.
> * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather 
> rather than serially querying every row that needs to be completed. This 
> isn't a clear win perhaps, given it targets the latency of single queries and 
> adds some complexity. (Certainly a decent candidate to kick even out of this 
> issue.)
> *Documentation and Intelligibility*
> * There are a few places (CHANGES.txt, tracing output in 
> {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side 
> filtering protection" (which makes it seem like the coordinator doesn't 
> filter) rather than "replica filtering protection" (which sounds more like 
> what we actually do, which is protect ourselves against incorrect replica 
> filtering results). It's a minor fix, but would avoid confusion.
> * The method call chain in {{DataResolver}} might be a bit simpler if we put 
> the {{repairedDataTracker}} in {{ResolveContext}}.
> *Guardrails*
> * As it stands, we don't have a way to enforce an upper bound on the memory 
> usage of {{ReplicaFilteringProtection}} which caches row responses from the 
> first round of requests. (Remember, these are later merged with the 
> second round of results to complete the data for filtering.) Operators will 
> likely need a way to protect themselves, i.e. simply fail queries if they hit 
> a particular threshold rather than GC nodes into oblivion. (Having control 
> over limits and page sizes doesn't quite get us there, because stale results 
> _expand_ the number of incomplete results we must cache.) The fun question is 
> how we do this, with the primary axes being scope (per-query, global, etc.) 
> and granularity (per-partition, per-row, per-cell, actual heap usage, etc.). 
> My starting disposition on the right trade-off between 
> performance/complexity and accuracy is having something along the lines of 
> cached rows per query. Prior art suggests this probably makes sense alongside 
> things like {{tombstone_failure_threshold}} in {{cassandra.yaml}}.
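
One possible shape for the "cached rows per query" idea, as a sketch only. {{CachedRowsGuardrail}} and its threshold are hypothetical and do not correspond to an existing {{cassandra.yaml}} option:

```java
// Hypothetical per-query guardrail for illustration; not Cassandra code.
final class CachedRowsGuardrail
{
    private final int failThreshold; // max rows one query may cache
    private int cachedRows;          // rows cached so far by this query

    CachedRowsGuardrail(int failThreshold)
    {
        this.failThreshold = failThreshold;
    }

    /** Call once for every row cached; fails the query once the threshold is exceeded. */
    void onRowCached()
    {
        if (++cachedRows > failThreshold)
            throw new IllegalStateException("Query cached " + cachedRows +
                                            " rows, exceeding the threshold of " + failThreshold);
    }

    int cachedRows()
    {
        return cachedRows;
    }
}
```

One instance per query keeps the scope per-query and the granularity per-row, which is the trade-off suggested in the description. A real implementation would more likely raise a query-level error than a plain runtime exception.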






[jira] [Commented] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection

2020-06-29 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17148356#comment-17148356
 ] 

ZhaoYang commented on CASSANDRA-15907:
--

As discussed with Caleb, the memory issue is that potentially outdated rows in 
the 1st phase of replica-filtering-protection (RFP) do not count towards the merged 
counter, so short-read-protection (SRP) can potentially query and cache all data in 
the query range if only one replica has data.

Some ideas to cap memory usage during RFP:
 * Single-phase approach:
 ** Issue a blocking RFP read immediately at {{MergeListener#onMergedRows}} when 
detecting potentially outdated rows.
 ** This guarantees the coordinator will cache at most "limit * replicas" rows, 
assuming there are no tombstones.
 ** This should have similar performance to the current 2-phase approach, but the 
current approach can be optimized to execute RFP reads in parallel.
 * Two-phase approach with SRP only in the 2nd phase:
 ** The 1st phase is almost the same as the current approach: collecting 
potentially outdated rows, but without SRP.
 ** In the 2nd phase, issue RFP reads in parallel based on the rows collected in the 
1st phase.
 *** When the parallel RFP reads complete, merge the responses (original + RFP) 
again using the merger described in the previous approach, but only do blocking RFP 
for rows requested by SRP.
 ** With this approach, the amount of memory used is the same as in the single-phase 
approach. The number of blocking RFP reads for SRP rows is usually small.
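
The "issue RFP reads in parallel, then merge" step of the two-phase approach could look roughly like this. It is a sketch under assumptions: {{ParallelCompletionReads}}, {{fetchAll}} and the {{read}} function are hypothetical stand-ins, and a real coordinator would use Cassandra's own read executors rather than {{CompletableFuture}}:

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical helper for illustration; not Cassandra code.
final class ParallelCompletionReads
{
    /**
     * Starts one read per collected key before waiting on any of them, then
     * gathers the results in key order so they can be merged afterwards.
     */
    static <K, R> Map<K, R> fetchAll(Collection<K> keys, Function<K, CompletableFuture<R>> read)
    {
        Map<K, CompletableFuture<R>> inFlight = new LinkedHashMap<>();
        for (K key : keys)
            inFlight.put(key, read.apply(key)); // start every read, do not block yet

        Map<K, R> results = new LinkedHashMap<>();
        inFlight.forEach((key, future) -> results.put(key, future.join())); // gather
        return results;
    }
}
```

The blocking reads for rows requested by SRP would still be issued one at a time during the merge; only the reads for rows collected in the 1st phase are batched this way.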

 


[jira] [Assigned] (CASSANDRA-15866) stream sstable attached index files entirely with data file

2020-06-27 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang reassigned CASSANDRA-15866:


Assignee: (was: ZhaoYang)

> stream sstable attached index files entirely with data file
> ---
>
> Key: CASSANDRA-15866
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15866
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Streaming
>Reporter: ZhaoYang
>Priority: Normal
>
> When sstable is streamed entirely, there is no need to rebuild sstable 
> attached index on receiver if index files can be streamed entirely.






[jira] [Updated] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-25 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15900:
-
Test and Documentation Plan: 
[https://circleci.com/workflow-run/ba9f4692-da21-44e9-ac31-fe8d2e6215cb]  (was: 
[https://circleci.com/workflow-run/8d266871-2d78-4c67-80ec-3e817187af0c])







[jira] [Updated] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-25 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15900:
-
Status: Review In Progress  (was: Changes Suggested)







[jira] [Commented] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-25 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17145693#comment-17145693
 ] 

ZhaoYang commented on CASSANDRA-15900:
--

bq. It might be worthwhile to have a test in AsyncStreamingOutputPlusTest that 
verifies AsyncStreamingOutputPlus#writeFileToChannel() closes the provided 
channel.

+1

bq. AsyncStreamingOutputPlus#writeFileToChannel(FileChannel, StreamRateLimiter, 
int) and AsyncStreamingOutputPlus#writeFileToChannelZeroCopy() may be better 
off at private visibility, given we're treating them as transport-level 
implementation details. (Perhaps writeFileToChannel would be easier to test at 
package-private though.)

I left them as public and marked them "@VisibleForTesting".

bq. The JavaDoc for writeFileToChannel(FileChannel, StreamRateLimiter) is 
slightly out-of date now, given we've lowered the batch size for the SSL case. 
(We should make sure to preserve the bit about the method taking ownership of 
the FileChannel.)

+1







[jira] [Created] (CASSANDRA-15903) Doc update: stream-entire-sstable supports all compaction strategies and internode encryption

2020-06-25 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-15903:


 Summary: Doc update: stream-entire-sstable supports all compaction 
strategies and internode encryption
 Key: CASSANDRA-15903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15903
 Project: Cassandra
  Issue Type: Task
Reporter: ZhaoYang


As [~mck2] pointed out, the doc needs to be updated for CASSANDRA-15657 and 
CASSANDRA-15740.






[jira] [Comment Edited] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-24 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143778#comment-17143778
 ] 

ZhaoYang edited comment on CASSANDRA-15900 at 6/24/20, 4:58 PM:


[~djoshi] do you mind reviewing and checking it on apache ci?


was (Author: jasonstack):
[~djoshi] do you mind reviewing?







[jira] [Updated] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-24 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15900:
-
Test and Documentation Plan: 
[https://circleci.com/workflow-run/8d266871-2d78-4c67-80ec-3e817187af0c]  (was: 
[https://circleci.com/workflow-run/48b5c613-f3a5-485f-ad0e-8362fddea5d8])







[jira] [Updated] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-24 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15900:
-
Description: 
CASSANDRA-15740 added the ability to stream entire sstable by loading on-disk 
file into user-space off-heap buffer when SSL is enabled, because netty doesn't 
support zero-copy with SSL.

But there are two issues:
 # file channel is not closed.
 # 1mb batch size is used. 1mb exceeds buffer pool's max allocation size, thus 
it's all allocated outside the pool and will cause large amount of allocations.

[Patch|https://github.com/apache/cassandra/pull/651]:
 # close file channel when the last batch is loaded into off-heap bytebuffer. I 
don't think we need to wait until buffer is flushed by netty.
 # reduce the batch to 64kb which is more buffer pool friendly when streaming 
entire sstable with SSL.

  was:
CASSANDRA-15740 added the ability to stream entire sstable by loading on-disk 
file into user-space off-heap buffer when SSL is enabled, because netty doesn't 
support zero-copy with SSL.

But there are two issues:
# file channel is not closed.
# 1mb batch size is used. 1mb exceeds buffer pool's max allocation size, thus 
it's all allocated outside the pool and will cause large amount of allocations.
 
[Patch|https://github.com/apache/cassandra/pull/651]:
# close file channel when the last batch is loaded into off-heap bytebuffer.
# reduce the batch to 64kb which is more buffer pool friendly when streaming 
entire sstable with SSL.








[jira] [Updated] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-24 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15900:
-
Test and Documentation Plan: 
[https://circleci.com/workflow-run/48b5c613-f3a5-485f-ad0e-8362fddea5d8]
 Status: Patch Available  (was: Open)

[~djoshi] do you mind reviewing?

> Close channel and reduce buffer allocation during entire sstable streaming 
> with SSL
> ---
>
> Key: CASSANDRA-15900
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15900
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> CASSANDRA-15740 added the ability to stream an entire sstable by loading the 
> on-disk file into a user-space off-heap buffer when SSL is enabled, because 
> netty doesn't support zero-copy with SSL.
> But there are two issues:
> # The file channel is not closed.
> # A 1mb batch size is used. 1mb exceeds the buffer pool's max allocation size, 
> so every batch is allocated outside the pool, causing a large number of 
> allocations.
>  
> [Patch|https://github.com/apache/cassandra/pull/651]:
> # Close the file channel when the last batch is loaded into the off-heap 
> bytebuffer.
> # Reduce the batch size to 64kb, which is more buffer-pool friendly when 
> streaming an entire sstable with SSL.






[jira] [Updated] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-24 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15900:
-
 Bug Category: Parent values: Degradation(12984), Level 1 values: Resource 
Management(12995)
   Complexity: Normal
Discovered By: Code Inspection
Fix Version/s: 4.0-beta
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Close channel and reduce buffer allocation during entire sstable streaming 
> with SSL
> ---
>
> Key: CASSANDRA-15900
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15900
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> CASSANDRA-15740 added the ability to stream an entire sstable by loading the 
> on-disk file into a user-space off-heap buffer when SSL is enabled, because 
> netty doesn't support zero-copy with SSL.
> But there are two issues:
> # The file channel is not closed.
> # A 1mb batch size is used. 1mb exceeds the buffer pool's max allocation size, 
> so every batch is allocated outside the pool, causing a large number of 
> allocations.
>  
> [Patch|https://github.com/apache/cassandra/pull/651]:
> # Close the file channel when the last batch is loaded into the off-heap 
> bytebuffer.
> # Reduce the batch size to 64kb, which is more buffer-pool friendly when 
> streaming an entire sstable with SSL.






[jira] [Created] (CASSANDRA-15900) Close channel and reduce buffer allocation during entire sstable streaming with SSL

2020-06-24 Thread ZhaoYang (Jira)
ZhaoYang created CASSANDRA-15900:


 Summary: Close channel and reduce buffer allocation during entire 
sstable streaming with SSL
 Key: CASSANDRA-15900
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15900
 Project: Cassandra
  Issue Type: Bug
  Components: Legacy/Streaming and Messaging
Reporter: ZhaoYang
Assignee: ZhaoYang


CASSANDRA-15740 added the ability to stream an entire sstable by loading the 
on-disk file into a user-space off-heap buffer when SSL is enabled, because 
netty doesn't support zero-copy with SSL.

But there are two issues:
# The file channel is not closed.
# A 1mb batch size is used. 1mb exceeds the buffer pool's max allocation size, 
so every batch is allocated outside the pool, causing a large number of 
allocations.
 
[Patch|https://github.com/apache/cassandra/pull/651]:
# Close the file channel when the last batch is loaded into the off-heap 
bytebuffer.
# Reduce the batch size to 64kb, which is more buffer-pool friendly when 
streaming an entire sstable with SSL.
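
The two fixes above can be illustrated with a small standalone sketch. This is not the code from the patch: the class name {{BatchedFileLoader}}, the {{BATCH_SIZE}} constant and the {{Consumer}} sink are made up for illustration, and the buffers are allocated directly here rather than taken from Cassandra's buffer pool.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

// Hypothetical helper, not part of Cassandra.
public final class BatchedFileLoader {
    // 64 KiB batches, the size named in the ticket; the constant name is an assumption.
    public static final int BATCH_SIZE = 64 * 1024;

    /** Loads the file in BATCH_SIZE batches, hands each batch to the sink, returns the batch count. */
    public static int load(Path file, Consumer<ByteBuffer> sink) throws IOException {
        int batches = 0;
        // try-with-resources closes the channel right after the last batch is loaded,
        // without waiting for the consumer to flush the buffers anywhere.
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long remaining = channel.size();
            while (remaining > 0) {
                int toRead = (int) Math.min(BATCH_SIZE, remaining);
                // Assumption: a plain direct buffer stands in for a pooled off-heap buffer.
                ByteBuffer buffer = ByteBuffer.allocateDirect(toRead);
                while (buffer.hasRemaining()) {
                    if (channel.read(buffer) < 0)
                        throw new IOException("unexpected end of file: " + file);
                }
                buffer.flip();
                sink.accept(buffer);
                remaining -= toRead;
                batches++;
            }
        }
        return batches;
    }
}
```

With a 150,000-byte file this yields two full 64kb batches and one partial batch, and the channel is already closed by the time {{load}} returns.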






[jira] [Assigned] (CASSANDRA-14754) Add verification of state machine in StreamSession

2020-06-24 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang reassigned CASSANDRA-14754:


Assignee: (was: ZhaoYang)

> Add verification of state machine in StreamSession
> --
>
> Key: CASSANDRA-14754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14754
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Streaming and Messaging
>Reporter: Jason Brown
>Priority: Normal
> Fix For: 4.x
>
>
> {{StreamSession}} contains an implicit state machine, but we have no 
> verification of the safety of the transitions between states. For example, we 
> have no checks to ensure we cannot leave the final states (COMPLETED, FAILED).
> I propose we add some program logic in {{StreamSession}}, tests, and 
> documentation to ensure the correctness of the state transitions.
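
One possible shape for such a check is an explicit transition table that rejects anything not listed. The sketch below is only an illustration: the class {{SessionStateMachine}}, its state names and the allowed transitions are assumptions, not the actual {{StreamSession}} states.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified state machine; not the real StreamSession.
public final class SessionStateMachine {
    public enum State { INITIALIZED, PREPARING, STREAMING, COMPLETED, FAILED }

    // Allowed transitions (assumed for illustration). Final states map to an empty set.
    private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.INITIALIZED, EnumSet.of(State.PREPARING, State.FAILED));
        ALLOWED.put(State.PREPARING, EnumSet.of(State.STREAMING, State.COMPLETED, State.FAILED));
        ALLOWED.put(State.STREAMING, EnumSet.of(State.COMPLETED, State.FAILED));
        ALLOWED.put(State.COMPLETED, EnumSet.noneOf(State.class));
        ALLOWED.put(State.FAILED, EnumSet.noneOf(State.class));
    }

    private State current = State.INITIALIZED;

    public synchronized State current() {
        return current;
    }

    /** Moves to the next state, or throws if the transition is not in the table. */
    public synchronized void transitionTo(State next) {
        if (!ALLOWED.get(current).contains(next))
            throw new IllegalStateException("illegal transition " + current + " -> " + next);
        current = next;
    }
}
```

Because COMPLETED and FAILED have no outgoing transitions, any attempt to leave a final state fails loudly instead of silently corrupting the session.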






[jira] [Updated] (CASSANDRA-15299) CASSANDRA-13304 follow-up: improve checksumming and compression in protocol v5-beta

2020-06-24 Thread ZhaoYang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-15299:
-
Reviewers: Alex Petrov  (was: Alex Petrov, ZhaoYang)

> CASSANDRA-13304 follow-up: improve checksumming and compression in protocol 
> v5-beta
> ---
>
> Key: CASSANDRA-15299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15299
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Aleksey Yeschenko
>Assignee: Sam Tunnicliffe
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-alpha
>
>
> CASSANDRA-13304 made an important improvement to our native protocol: it 
> introduced checksumming/CRC32 to request and response bodies. It’s an 
> important step forward, but it doesn’t cover the entire stream. In 
> particular, the message header is not covered by a checksum or a crc, which 
> poses a correctness issue if, for example, {{streamId}} gets corrupted.
> Additionally, we aren’t quite using CRC32 correctly, in two ways:
> 1. We are calculating the CRC32 of the *decompressed* value instead of 
> computing the CRC32 on the bytes written on the wire - losing the properties 
> of the CRC32. In some cases, due to this sequencing, attempting to decompress 
> a corrupt stream can cause a segfault by LZ4.
> 2. When using CRC32, the CRC32 value is written in the incorrect byte order, 
> also losing some of the protections.
> See https://users.ece.cmu.edu/~koopman/pubs/KoopmanCRCWebinar9May2012.pdf for 
> an explanation of the two points above.
> Separately, there are some long-standing issues with the protocol - since 
> *way* before CASSANDRA-13304. Importantly, both checksumming and compression 
> operate on individual message bodies rather than frames of multiple complete 
> messages. In reality, this has several important additional downsides. To 
> name a couple:
> # For compression, we are getting poor compression ratios for smaller 
> messages - when operating on tiny sequences of bytes. In reality, for most 
> small requests and responses we are discarding the compressed value as it’d 
> be smaller than the uncompressed one - incurring both redundant allocations 
> and compressions.
> # For checksumming and CRC32 we pay a high overhead price for small messages. 
> 4 bytes extra is *a lot* for an empty write response, for example.
> To address the correctness issue of {{streamId}} not being covered by the 
> checksum/CRC32 and the inefficiency in compression and checksumming/CRC32, we 
> should switch to a framing protocol with multiple messages in a single frame.
> I suggest we reuse the framing protocol recently implemented for internode 
> messaging in CASSANDRA-15066 to the extent that its logic can be borrowed, 
> and that we do it before native protocol v5 graduates from beta. See 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderCrc.java
>  and 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderLZ4.java.
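
To make the ordering concern concrete, here is a minimal sketch of a frame whose CRC32 covers the header and the bytes exactly as written on the wire, and which is verified before the payload is touched (for example, before any decompression). The layout (4-byte length, payload, 4-byte CRC32, big-endian) and the class name {{CrcFrame}} are assumptions for illustration only; this is not the framing format of CASSANDRA-15066 or of protocol v5.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical frame layout: [int length][payload as written on the wire][int crc32(length + payload)].
public final class CrcFrame {
    public static byte[] encode(byte[] wirePayload) {
        ByteBuffer frame = ByteBuffer.allocate(4 + wirePayload.length + 4);
        frame.putInt(wirePayload.length);
        frame.put(wirePayload);
        // The checksum covers the header and the on-wire bytes, not a decompressed value.
        CRC32 crc = new CRC32();
        crc.update(frame.array(), 0, 4 + wirePayload.length);
        frame.putInt((int) crc.getValue());
        return frame.array();
    }

    /** Verifies the CRC first and only then returns the payload for further processing. */
    public static byte[] decode(byte[] frame) {
        if (frame.length < 8)
            throw new IllegalArgumentException("frame too short");
        ByteBuffer in = ByteBuffer.wrap(frame);
        int length = in.getInt();
        if (length != frame.length - 8)
            throw new IllegalArgumentException("length header does not match frame size");
        CRC32 crc = new CRC32();
        crc.update(frame, 0, 4 + length);
        int expected = in.getInt(4 + length); // absolute read, position stays at the payload
        if ((int) crc.getValue() != expected)
            throw new IllegalArgumentException("CRC mismatch");
        byte[] payload = new byte[length];
        in.get(payload);
        return payload;
    }
}
```

A corrupted byte anywhere in the header or payload is rejected before a decompressor would ever see it, which is the property lost when the CRC is computed over the decompressed value.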






[jira] [Comment Edited] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure

2020-06-23 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143096#comment-17143096
 ] 

ZhaoYang edited comment on CASSANDRA-15861 at 6/24/20, 4:44 AM:


{quote}Don't we write to a tmp file then do an atomic move and replace? So would 
we need to worry about a partial file?
{quote}
For index summary, it deletes the old file first. (I believe the reason for the 
deletion is that the index summary file can be large, up to 2GB, so it is nice 
to release the old file early if it is no longer used.) Of course, we can 
change it to use a temp file.
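
For reference, the temp-file approach could look roughly like the sketch below. It is a generic illustration with a made-up class name ({{AtomicReplace}}), not the index summary code. It assumes the temp file lives in the same directory as the target, and relies on the platform replacing an existing target on an atomic move (true for POSIX rename, but implementation specific according to the {{Files.move}} documentation).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Cassandra.
public final class AtomicReplace {
    /** Writes contents to a temp file next to target, then atomically moves it into place. */
    public static void write(Path target, byte[] contents) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Same directory as the target, so the move stays on one file system.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, contents);
            // Readers see either the old file or the complete new one, never a partial file.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up the temp file on failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

The trade-off mentioned above still applies: the old and the new file both exist on disk until the move completes.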


was (Author: jasonstack):
{quote}Don't we write to a tmp file then do a atomic move and replace? So would 
we need to worry about a partial file?
{quote}
For index summary, it deletes first. Of course, we can change it to use temp 
file..

> Mutating sstable component may race with entire-sstable-streaming(ZCS) 
> causing checksum validation failure
> --
>
> Key: CASSANDRA-15861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Consistency/Streaming, 
> Local/Compaction
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - 
> repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> {code:java|title=stacktrace}
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 
> CassandraEntireSSTableStreamReader.java:145 - [Stream 
> 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream 
> for table = keyspace1.standard1
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226)
>   at 
> org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140)
>   at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36)
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
>   at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Checksums do not match for 
> /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db
> {code}
>  
> In the above test, "nodetool repair" is executed on node1 and node2 is killed 
> during the repair. At the end, node3 reports a checksum validation failure on 
> the sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair starts on node1, it performs anti-compaction, which sets the 
> sstable's repairedAt to 0 and its pending repair id to the session id.
> 2. Then node1 creates a {{ComponentManifest}} which contains the file lengths 
> to be transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 
> starts to broadcast a repair-failure-message to all participants in 
> {{CoordinatorSession#fail}}.
> 4. Node1 receives its own repair-failure-message and fails its local repair 
> sessions at {{LocalSessions#failSession}}, which triggers an async background 
> compaction.
> 5. Node1's background compaction mutates the sstable's repairedAt to 0 and 
> its pending repair id to null via 
> {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more 
> in-progress repair.
> 6. Node1 actually sends the sstable to node3 
