[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-09-02 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727027#comment-14727027
 ] 

Stefania commented on CASSANDRA-9673:
-

Thank you Aleksey for all the work you've put in. 

The dtest pull request is here:

https://github.com/riptano/cassandra-dtest/pull/496/files.

> Improve batchlog write path
> ---
>
> Key: CASSANDRA-9673
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Aleksey Yeschenko
> Assignee: Stefania
> Labels: performance
> Fix For: 3.0 beta 2
>
> Attachments: 9673_001.tar.gz, 9673_004.tar.gz, 
> gc_times_first_node_patched_004.png, gc_times_first_node_trunk_004.png
>
>
> Currently we allocate an on-heap {{ByteBuffer}} to serialize the batched 
> mutations into, before sending it to a distant node, generating unnecessary 
> garbage (potentially a lot of it).
> With materialized views using the batchlog, it would be nice to optimise the 
> write path:
> - introduce a new verb ({{Batch}})
> - introduce a new message ({{BatchMessage}}) that would encapsulate the 
> mutations, expiration, and creation time (similar to {{HintMessage}} in 
> CASSANDRA-6230)
> - have MS serialize it directly instead of relying on an intermediate buffer
> To avoid merely shifting the temp buffer to the receiving side(s) we should 
> change the structure of the batchlog table to use a list or a map of 
> individual mutations.
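
The difference between the two write paths in the description can be sketched roughly as follows. This is a minimal illustration, not Cassandra's code: {{BatchSketch}}, {{serializeViaBuffer}}, {{serializeDirect}} and {{capture}} are hypothetical names, and mutations are modelled as plain byte arrays.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical sketch; names and the wire layout are assumptions for illustration only.
final class BatchSketch
{
    // Old shape: copy everything into a temporary on-heap buffer, then write the buffer.
    static void serializeViaBuffer(List<byte[]> mutations, DataOutput out) throws IOException
    {
        int size = 4; // mutation count
        for (byte[] m : mutations)
            size += 4 + m.length; // length prefix + payload

        ByteBuffer tmp = ByteBuffer.allocate(size); // garbage proportional to the batch size
        tmp.putInt(mutations.size());
        for (byte[] m : mutations)
        {
            tmp.putInt(m.length);
            tmp.put(m);
        }
        out.write(tmp.array(), 0, tmp.position());
    }

    // New shape: write each mutation straight to the output, with no intermediate copy.
    static void serializeDirect(List<byte[]> mutations, DataOutput out) throws IOException
    {
        out.writeInt(mutations.size());
        for (byte[] m : mutations)
        {
            out.writeInt(m.length);
            out.write(m);
        }
    }

    // Helper for comparing the two encodings; the in-memory stream stands in for the socket.
    static byte[] capture(boolean direct, List<byte[]> mutations)
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes))
        {
            if (direct)
                serializeDirect(mutations, out);
            else
                serializeViaBuffer(mutations, out);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Both methods produce the same bytes; only the direct variant avoids the temporary allocation.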



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-09-02 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726964#comment-14726964
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

The dtests fail exactly the same 8 tests as the vanilla cassandra-3.0 branch; 
the utests have 2 UDF access control failures and the relatively common 
recoverymanager timeout.

Committed as 
[53a177a9150586e56408f25c959f75110a2997e7|https://github.com/apache/cassandra/commit/53a177a9150586e56408f25c959f75110a2997e7]
 to cassandra-3.0 and merged with trunk.

Thank you for your work and your patience.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-09-01 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726587#comment-14726587
 ] 

Stefania commented on CASSANDRA-9673:
-

bq. I meant checking for table emptiness at the beginning of startup migration, 
and logging once if not empty. 

Right, makes more sense now.

bq. Also, the logic in conversion for != 1 uuids seems a bit weird. It will 
never be the case that we'll get a mutation

The replay unit tests were still checking for 1.2 mutations.

bq. There was a bug in SP::syncWriteToBatchlog that is as old as is batchlog 
itself. We are using CL.ONE unconditionally, even when we have two endpoints. 
And both legacy/modern writes should be sharing the same callback, so I 
switched back to using WriteResponseHandler for both.

Thanks for fixing this.

bq.  I think I broke some replay tests. 

I removed the 1.2 mutations from the replay tests. They are fine now.

Rebased again and force pushed, if the latest CI is good then I'm +1 as well:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9673-3.0-dtest/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9673-3.0-testall/
Thanks!



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-09-01 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726299#comment-14726299
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

I meant checking for table emptiness at the beginning of startup migration, and 
logging once if not empty. Logging on each conversion is obviously overkill.

Also, the logic in conversion for != 1 uuids seems a bit weird. It will never 
be the case that we'll get a mutation on upgrade with a random uuid. Made it 
just use the counter unconditionally at all times.

There was a bug in {{SP::syncWriteToBatchlog}} that is as old as the batchlog 
itself. We are using CL.ONE unconditionally, even when we have two endpoints. 
And both legacy/modern writes should be sharing the same callback, so I 
switched back to using {{WriteResponseHandler}} for both.
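
A minimal sketch of the fix described above, assuming the intent is to wait for an acknowledgement from every chosen batchlog endpoint (one or two). {{BatchlogAckSketch}}, {{Level}}, {{levelFor}} and {{Handler}} are hypothetical names, not Cassandra's classes.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch; it only illustrates choosing the level from the endpoint count.
final class BatchlogAckSketch
{
    enum Level
    {
        ONE(1), TWO(2);

        final int blockFor;

        Level(int blockFor)
        {
            this.blockFor = blockFor;
        }
    }

    // The buggy behaviour corresponds to always returning Level.ONE here.
    static Level levelFor(int endpointCount)
    {
        if (endpointCount < 1 || endpointCount > 2)
            throw new IllegalArgumentException("expected 1 or 2 batchlog endpoints, got " + endpointCount);
        return endpointCount == 1 ? Level.ONE : Level.TWO;
    }

    // A single callback type shared by legacy and modern batchlog writes.
    static final class Handler
    {
        private final CountDownLatch pending;

        Handler(Level level)
        {
            pending = new CountDownLatch(level.blockFor);
        }

        void onResponse()
        {
            pending.countDown();
        }

        boolean isDone()
        {
            return pending.getCount() == 0;
        }
    }
}
```

With two endpoints a single acknowledgement no longer completes the write, which is the behaviour the unconditional CL.ONE was missing.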

Rebased against most recent cassandra-3.0 and force-pushed. I think I broke 
some replay tests. Could you have a look and fix, if necessary? And, if you are 
overall +1, I'll commit.

Thank you.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-27 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717556#comment-14717556
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

I got rid of {{BatchResponse}} and switched to just {{WriteResponse}}, which I 
made use a singleton instance of {{WriteResponse}}. We can't/shouldn't cache 
the message itself for batch responses, though (unlike with hints, where we 
can), because that has consequences for tracing (metadata in headers is 
per-message).
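
The distinction between a shared payload and a per-message envelope can be sketched as below. This is an illustration under assumptions: {{ResponseSketch}}, {{WriteAck}}, {{Message}} and the {{TraceSession}} header key are hypothetical stand-ins, not the real messaging classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the stateless payload is a singleton, the envelope is not.
final class ResponseSketch
{
    // The payload carries no state, so one shared instance is enough.
    static final class WriteAck
    {
        static final WriteAck INSTANCE = new WriteAck();

        private WriteAck()
        {
        }
    }

    // The envelope carries per-message metadata (e.g. tracing), so it must not be cached.
    static final class Message
    {
        final WriteAck payload;
        final Map<String, String> headers = new HashMap<>();

        Message(WriteAck payload)
        {
            this.payload = payload;
        }
    }

    static Message newResponse(String traceSession)
    {
        Message message = new Message(WriteAck.INSTANCE); // fresh envelope, shared payload
        if (traceSession != null)
            message.headers.put("TraceSession", traceSession);
        return message;
    }
}
```

Caching the whole {{Message}} would leak one request's trace headers into another request's response, which is why only the payload is shared here.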

A few more things:
- {{BatchCallback::onFailure}} shouldn't just signal - it masks failure
- {{LegacyBatchlogMigrator}} should log at {{INFO}}, conditional on legacy 
batchlog table being empty
- {{LegacyBatchlogMigrator}} should try and calculate page size from sstable 
stats (like {{LegacyHintsMigrator}})

Nit: {{StorageProxy.BatchlogEndpoints}} constructor is using {{forEach}} for no 
apparent reason. Just use a for-loop there. Plus, {{this}} references there are 
redundant and thus against our code style.

I feel like I'm starting to lose focus here. I think we are in a good state and 
ready to commit once those are addressed - you've done a good job. So let's 
just do that, and I'll create follow-up JIRAs if necessary.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-27 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717573#comment-14717573
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

As for your timestamp question: we don't care at all about the original 
{{written_at}} not matching the timestamp from the timeuuid. Let's just stick to 
always using {{UUIDGen.unixTimestamp(id)}} in case of v1, unconditionally.

Also, there is a bug in {{LBM::apply}}: we are calling {{Batch::createLocal}} 
using the current {{timestampMicros}}, whereas we should be using the original 
creation time. Otherwise we risk resurrecting expired batches here, because of 
an overly fresh {{creationTime}}.
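
To illustrate both points, here is a minimal sketch using only {{java.util.UUID}}. It assumes that deriving the creation time means reading the timestamp embedded in a version 1 UUID; {{BatchTimeSketch}} and its methods are hypothetical and do not reproduce {{UUIDGen}} or {{Batch}}.

```java
import java.util.UUID;

// Hypothetical sketch of deriving a batch's creation time from its time-based id.
final class BatchTimeSketch
{
    // 100ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
    private static final long UUID_EPOCH_OFFSET_100NS = 0x01B21DD213814000L;

    // Unix time in milliseconds encoded in a version 1 UUID.
    static long unixMillis(UUID id)
    {
        if (id.version() != 1)
            throw new IllegalArgumentException("not a time-based UUID: " + id);
        return (id.timestamp() - UUID_EPOCH_OFFSET_100NS) / 10_000;
    }

    // Creation time in microseconds, taken from the id rather than from "now".
    static long creationMicros(UUID id)
    {
        return unixMillis(id) * 1000;
    }

    static boolean isExpired(long creationMicros, long nowMicros, long timeoutMicros)
    {
        return nowMicros - creationMicros > timeoutMicros;
    }

    // Builds a version 1 UUID for a given Unix time; only used to exercise the sketch.
    static UUID timeUuidAt(long unixMillis)
    {
        long ts = unixMillis * 10_000 + UUID_EPOCH_OFFSET_100NS;
        long msb = (ts << 32)                  // time_low
                 | ((ts >>> 16) & 0xFFFF0000L) // time_mid
                 | 0x1000L                     // version 1
                 | ((ts >>> 48) & 0x0FFFL);    // time_hi
        return new UUID(msb, 0x8000000000000000L); // RFC 4122 variant, zero clock seq and node
    }
}
```

A batch written an hour ago is reported as expired when its creation time comes from the id, but would look brand new (and be replayed) if the migration stamped it with the current time.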



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-27 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717983#comment-14717983
 ] 

Stefania commented on CASSANDRA-9673:
-

bq. LegacyBatchlogMigrator should log at INFO, conditional on legacy batchlog 
table being empty

Do you mean only during migration or all the time? Also, when would we check 
that the table is empty: before migrating, after migrating or every time?

bq. LegacyBatchlogMigrator should try and calculate page size from sstable 
stats (like LegacyHintsMigrator)

I've applied the logic already available in {{BatchlogManager}}, which is not 
the same as {{LegacyHintsMigrator}}. Let me know if the latter is preferable.

bq. Also, there is a bug in LBM::apply: we are calling Batch::createLocal using 
the current timestampMicros, whereas we should be using the original create 
time. Otherwise we risk to resurrect expired batches here, because of overly 
fresh creationTime.

Is it sufficient to multiply the timestamp by 1000 or do we need to dig it out 
from the partition update? Not sure how to do this via 
{{QueryProcessor.executeInternalWithPaging}}.

Remaining points are done.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-26 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715387#comment-14715387
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

bq. If I am correct then I believe HintsBuffer.ENTRY_OVERHEAD_SIZE also must 
change, since it wasn't totally trivial to fix this I reverted HintsMessage.

I think the problem is that I forgot to update serialization logic in 
{{EncodedHintMessage}}. Re-running the dtests with it corrected, will let you 
know how it goes.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-26 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712610#comment-14712610
 ] 

Stefania commented on CASSANDRA-9673:
-

Thanks for all your work, it's in much better shape now.

bq. As for migration, my preference would be to make it happen at the borders 
only. So I'd ask you again to move this live upgrade batchlog mutation 
conversion to MutationVerbHandler, and away from Keyspace::apply. The latter is 
just too deep.

Done. I was trying to limit the impact so I pushed it down until I found the 
loop on the partition updates. I've added a check on the message version 
though, so we should be OK.

bq. A standalone test for LegacyBatchMigrator::converBatchEntries would be nice.

{{BatchLogManager.testReplay}} tests this (now that we moved the conversion to 
{{MutationVerbHandler}})

bq. I would also prefer to move the whole package to o.a.c.batchlog instead of 
o.a.c.db.batch. o.a.c.db is overpopulated, and, frankly, meaningless. I've been 
depopulating it slowly - with schema and hints being separate top-level 
packages already.

Done.

--

Hinted handoff dtests were broken, so I reverted vints in {{HintsMessage}}. I 
think the hint size must match what's stored on disk? If I am correct, then I 
believe {{HintsBuffer.ENTRY_OVERHEAD_SIZE}} must also change; since it wasn't 
totally trivial to fix this, I reverted {{HintsMessage}}. I did not revert 
{{Hint}}, however; it's up to you whether to fix everything here or move it to 
another ticket.

Regarding {{MaterializedViewLongTest}}, I ran it in a loop 10 times and it 
failed once on both unpatched 3.0 and patched 3.0. There is one 
obvious problem in that the test is still looking at _system.batchlog_ rather 
than _system.batches_; I fixed that. I also think we need to wait for the 
ordinary mutations to be applied in the test as well, which means waiting for 
the ordinary MUTATION stage. Finally, shouldn't the mutations with only a local 
endpoint in SP.mutateMV be applied in MATERIALIZED_VIEW_MUTATION stage? See 
tentative fix 
[here|https://github.com/stef1927/cassandra/commit/7d772ebc731524a3cdf08225451e8022bc98b7be].
 With the fix applied the test ran 20 times successfully, but this doesn't 
necessarily mean it is OK now. Also, the change in SP may not be needed, 
provided we wait for the MUTATION stage as well in the test (I did not test 
without the changes to SP).



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-26 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713510#comment-14713510
 ] 

T Jake Luciani commented on CASSANDRA-9673:
---

Then the whole base mutation is aborted. 



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-26 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713411#comment-14713411
 ] 

Stefania commented on CASSANDRA-9673:
-

Understood. And how are exceptions handled? Say the first mutation is applied 
and the second throws: they don't seem to be added to the batchlog either?



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-26 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713584#comment-14713584
 ] 

Stefania commented on CASSANDRA-9673:
-

Can they ever time out when building? 

{{MaterializedViewBuilder.buildKey}} catches a WTE thrown by {{SP.mutateMV()}} 
and has a warning that says _Encountered write timeout when building 
materialized view, the entries were stored in the batchlog and will be replayed 
at another time_. That doesn't look true unless there is another batchlog 
stored elsewhere that I missed.

Also, when building it doesn't look like we are in a mutation stage but in the 
compaction executor.

Anyway, I think I understand a bit better now, thanks for your answers.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-26 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713361#comment-14713361
 ] 

T Jake Luciani commented on CASSANDRA-9673:
---

bq. shouldn't the mutations with only a local endpoint in SP.mutateMV be 
applied in MATERIALIZED_VIEW_MUTATION stage?

No, because it just slows things down. We are already in a mutation stage, so 
it's faster to just apply the memtable in the current thread vs sending off to 
another.
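
The trade-off described above can be sketched roughly like this. The names are hypothetical ({{LocalApplySketch}}, {{apply}}, {{viewStage}}), and a plain {{java.util.concurrent.Executor}} stands in for a stage; it is not how {{SP.mutateMV}} is written.

```java
import java.util.concurrent.Executor;

// Hypothetical sketch: skip the hand-off when the only target is the local node.
final class LocalApplySketch
{
    static void apply(Runnable mutation, boolean onlyLocalEndpoint, Executor viewStage)
    {
        if (onlyLocalEndpoint)
            mutation.run(); // already on a mutation thread: apply inline, no queueing or context switch
        else
            viewStage.execute(mutation); // otherwise hand off to the dedicated stage
    }
}
```

The inline branch trades stage isolation for lower latency, which matches the reasoning given in the comment.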



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-26 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713399#comment-14713399
 ] 

T Jake Luciani commented on CASSANDRA-9673:
---

I've opened CASSANDRA-10197 to address the flaky test.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-26 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715834#comment-14715834
 ] 

Stefania commented on CASSANDRA-9673:
-

bq. I think the problem is that I forgot to update serialization logic in 
EncodedHintMessage.  Re-running the dtests with it corrected, will let you know 
how it goes.

I missed the obvious place to check yesterday, thanks for fixing this. Your CI 
results look OK.

I reverted the unrelated changes to {{SP.mutateMV()}} and made a tiny change to 
{{HintsMessage}}. CI is pending. 



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-25 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711328#comment-14711328
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

To quote Carl, I'm not sure if the BL needed to be separated since the MV were 
moved off; it probably isn't a problem now that we don't do remote replica 
batchlog.

I'll rebase on top of the latest 3.0 and see if this is still an issue, but let 
me second that I'm extremely not fond of a new stage just for BL writes.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-25 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711322#comment-14711322
 ] 

T Jake Luciani commented on CASSANDRA-9673:
---

bq. However it doesn't go through SP.mutateMV() so I don't see how we could 
have broken it, perhaps because we've removed the dedicated stage? 

You removed the dedicated stage? Yes that would cause these timeouts.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-25 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711349#comment-14711349
 ] 

T Jake Luciani commented on CASSANDRA-9673:
---

Actually, this happens directly without submitting to another stage.  So let me 
look at the patch and see if anything changed there.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-25 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710673#comment-14710673
 ] 

Stefania commented on CASSANDRA-9673:
-

Here are the CI results (build #5 still pending).

Failing utests:

Build #4:
* {{org.apache.cassandra.db.RecoveryManagerTest.testRecoverPITUnordered}} - 
timed out also on 3.0 (build #105)

Build #3:
* 
{{org.apache.cassandra.io.sstable.IndexSummaryManagerTest.testRedistributeSummaries-compression}}
 - timed out due to a compaction, seems unrelated

* {{org.apache.cassandra.cql3.MaterializedViewLongTest.testConflictResolution}} 
- this does worry me, sometimes it passes and sometimes it fails, it seems to 
always pass on 3.0. However it doesn't go through {{SP.mutateMV()}} so I don't 
see how we could have broken it, perhaps because we've removed the dedicated 
stage? Sample failure 
[here|http://cassci.datastax.com/job/stef1927-9673-3.0-testall/3/testReport/org.apache.cassandra.cql3/MaterializedViewLongTest/testConflictResolution/].

The failing dtests seem in line with 3.0. Note that I have a fix for the two 
failing tests in batch_test.py 
[here|https://github.com/riptano/cassandra-dtest/pull/496/commits].

I also added a function ({{nanoSince()}}) to distinguish legacy mutations with 
clashing timestamps but it's very slow - do you think we need this?

{code}
UUID newId = id;
// Regenerate the id if it is not a time UUID or its timestamp disagrees with written_at
if (id.version() != 1 || timestamp != UUIDGen.unixTimestamp(id))
    newId = UUIDGen.getTimeUUID(timestamp, nanoSince(id, timestamp));
{code}

As far as I understand only 1.2 mutations would have non-time UUIDs, 
{{id.version() != 1}}, however strictly speaking time uuids would not 
necessarily match {{written_at}}, which is the current time in micros divided 
by 1000, whereas the uuid would have been created a little earlier, so if we 
crossed the millisecond boundary we would have {{timestamp != 
UUIDGen.unixTimestamp(id)}}. The code I am talking about is in 
{{LegacyBatchMigrator.apply}} and it is called when applying legacy mutations. 
WDYT?
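
To make the millisecond-boundary concern above concrete, here is a minimal, self-contained sketch using only {{java.util.UUID}}. The class and method names ({{TimeUuidMillis}}, {{timeUuidFromMicros}}, {{unixMillis}}, {{needsNewId}}) are hypothetical and are not part of the patch; {{unixMillis}} is only assumed to behave like {{UUIDGen.unixTimestamp}}. It shows that a valid version 1 UUID created just before a millisecond boundary can disagree with a {{written_at}} value taken a couple of microseconds later.

```java
import java.util.UUID;

// Hypothetical illustration only; not the Cassandra UUIDGen implementation.
class TimeUuidMillis {
    // Number of 100ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch).
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    /** Builds a version 1 UUID whose timestamp matches the given Unix time in microseconds. */
    static UUID timeUuidFromMicros(long unixMicros) {
        long ts = unixMicros * 10 + UUID_EPOCH_OFFSET;   // convert to 100ns units
        long msb = (ts & 0xFFFFFFFFL) << 32              // time_low
                 | ((ts >>> 32) & 0xFFFFL) << 16         // time_mid
                 | 0x1000L                               // version 1
                 | ((ts >>> 48) & 0x0FFFL);              // time_hi
        long lsb = 0x8000000000000000L;                  // RFC 4122 variant, zero clock seq and node
        return new UUID(msb, lsb);
    }

    /** Unix time in milliseconds embedded in a version 1 UUID. */
    static long unixMillis(UUID id) {
        return (id.timestamp() - UUID_EPOCH_OFFSET) / 10_000;
    }

    /** Mirrors the condition in the snippet above: true if the id would be regenerated. */
    static boolean needsNewId(UUID id, long writtenAtMillis) {
        return id.version() != 1 || writtenAtMillis != unixMillis(id);
    }

    public static void main(String[] args) {
        long createdMicros = 1_440_460_800_000_999L;        // 1 microsecond before a ms boundary
        UUID id = timeUuidFromMicros(createdMicros);
        long writtenAtMillis = (createdMicros + 2) / 1000;  // written_at taken 2 microseconds later
        System.out.println("uuid millis       = " + unixMillis(id));
        System.out.println("written_at millis = " + writtenAtMillis);
        System.out.println("needs new id      = " + needsNewId(id, writtenAtMillis));
    }
}
```

In this sketch the id is a perfectly good time UUID, yet the check still fires because the two clocks were read on either side of a millisecond boundary. Whether that case matters in practice is exactly the open question in the comment above.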



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-25 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712186#comment-14712186
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

Pushed a bit more to [my 
branch|https://github.com/iamaleksey/cassandra/commits/9673-3.0] on top of your 
changes.

1. Makes it so that only {{BatchlogManager}} itself is aware of 
{{system.batches}} table and does any writes to it (we already had 
{{BatchlogManager::deleteBatch}} method, in fact)
2. Changes the code so that we don't allocate a redundant {{ArrayList}} in 
{{Batch}}, and so that we never mutate those collections (encoded/decoded) in 
place
3. Removes (now) redundant extra schedule call at {{BatchlogManager}} startup
4. Makes it so that if we are dealing with an encoded (remote) batch, then the 
mutations are always in the current messaging version format. Having the 
version separate felt brittle.
5. Switches to vints for batch and hint encoding
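
For readers unfamiliar with the idea behind item 5, the following is a minimal sketch of variable-length integer encoding. It uses the generic base-128 (LEB128-style) scheme, which is an assumption made for illustration and is not claimed to be the exact vint layout Cassandra uses; the class name {{VarIntSketch}} is hypothetical. The point is only that small values, such as a mutation count or a short payload size, take one or two bytes instead of a fixed four or eight.

```java
import java.io.ByteArrayOutputStream;

// Generic base-128 varint, for illustration only; not Cassandra's vint format.
class VarIntSketch {
    /** Encodes the value 7 bits at a time, low bits first; the high bit marks "more bytes follow". */
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    /** Decodes a value produced by encode(). */
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                break;                       // last byte of this value
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        for (long v : new long[] { 0L, 300L, Long.MAX_VALUE }) {
            byte[] encoded = encode(v);
            System.out.println(v + " -> " + encoded.length + " byte(s), decodes to " + decode(encoded));
        }
    }
}
```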

One of the goals for me was overall consistency with hints code, since the two 
are very related. After some coding, however, I realized that 
{{BatchStoreMessage}} was indeed redundant. It carries no extra information 
other than the {{Batch}} itself (unlike {{HintMessage}} that also carries the 
host id). Symmetry with {{BatchRemoveMessage}} was nice, but the latter was 
also completely redundant - it merely wraps the UUID and adds nothing. So I 
ditched both classes, and suggest that we just marshal Batch/UUID instances, 
raw.

I'm not done with the review yet, and don't have an answer to your uuid 
question atm. Just wanted to push the latest.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-25 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712198#comment-14712198
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

As for migration, my preference would be to make it happen at the borders only. 
So I'd ask you again to move this live upgrade batchlog mutation conversion to 
{{MutationVerbHandler}}, and away from {{Keyspace::apply}}. The latter is just 
too deep.

A standalone test for {{LegacyBatchMigrator::convertBatchEntries}} would be nice.

I would also prefer to move the whole package to o.a.c.batchlog instead of 
o.a.c.db.batch. o.a.c.db is overpopulated, and, frankly, meaningless. I've been 
depopulating it slowly - with schema and hints being separate top-level 
packages already.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-25 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712205#comment-14712205
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

P.S. Shouldn't be part of this ticket, but I only now noticed how poor our 
metrics for batchlog are. As in, we've got no metrics whatsoever. Will open a 
separate ticket to track how many batches were written, accepted, removed, 
replayed.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-24 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709544#comment-14709544
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

{{remove(UUID batchId)}} should probably be a method in {{BatchlogManager}} 
itself.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-24 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710437#comment-14710437
 ] 

Stefania commented on CASSANDRA-9673:
-

This is done, I'm just waiting for the CI results:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9673-3.0-dtest/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9673-3.0-testall/

You are correct, the previous strategy was to handle legacy mutations by 
invoking {{LegacyBatchMigrator.convertBatchEntries}} every time before 
replaying: 

_For replaying legacy mutations, I've opted for a conversion done before 
replaying, which is not very efficient, but keeps the code clean. I figured 
mixed version clusters are transient but if it concerns you I can enhance it._

I changed it so that legacy mutations are converted on-the-fly before being 
applied.

I created a new class, {{Batch}}, and moved the {{BatchStoreMessage}} logic 
there, so the messages are only concerned with serialization. As for the static 
remove method, I actually prefer to have it in {{Batch}} rather than 
{{BatchlogManager}}, but you can move it if you prefer. The remaining nits 
should also be addressed.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-23 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708398#comment-14708398
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

Rebased against most recent trunk and made some minor changes, here: 
https://github.com/iamaleksey/cassandra/tree/9673-3.0. Didn't go very far. 
Removed {{BATCHLOG_MUTATION}} Stage that Carl and I think is no longer needed 
(but we'll bring it back if that's a mistake), and did some renames and minor 
cleanup of unused methods.

Good work. There is only one real issue so far: we are not properly handling 
old-format batchlog mutations sent live from 2.1/2.2 nodes during live 
upgrades. You need to modify {{MutationVerbHandler}} and add special treatment 
for mutations to the old batchlog table.

With that fixed, we could commit as is, but I have a bunch of nits that'd be 
nice to address.

Nits:
1. I would prefer {{BatchStoreMessage}} and {{BatchRemoveMessage}} to be dumb 
and only be concerned about serialization/deserialization. This means no 
{{getMutation()}} in either. For {{BatchRemoveMessage}}, the actual logic 
should be in {{BatchRemoveVerbHandler}}.
2. I would like to separate {{Batch}} and {{BatchStoreMessage}} and put actual 
saving to batchlog logic into a new {{Batch}} class - to be used both for local 
(MV) writes and proper BL remote writes.
3. If moved to {{Batch}} (and if not, too), I would prefer the mutation 
collections in them to only be set in the constructor.
4. Somehow we are calling {{LegacyBatchMigrator.convertBatchEntries()}} on 
every replay, reverting Branimir's changes, instead of explicitly only doing it 
once, upon first replay. I'm presuming that this is done as a workaround for 
the live upgrade issue. I would prefer that we moved the migration to 
{{CassandraDaemon.setup()}}, where legacy schema and hints migration already 
get called.
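
Nit 3 above (mutation collections set only in the constructor) can be sketched roughly as follows. This is a hypothetical stand-in, not the actual {{Batch}} class from the patch: the name {{BatchSketch}}, its fields, and the generic mutation type {{M}} are assumptions chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

// Hypothetical sketch of an immutable batch holder; not the class from the patch.
final class BatchSketch<M> {
    final UUID id;
    final long creationTime;
    private final List<M> mutations;   // assigned once, never mutated in place

    BatchSketch(UUID id, long creationTime, Collection<M> mutations) {
        this.id = id;
        this.creationTime = creationTime;
        // Defensive copy plus read-only view: later changes by the caller cannot leak in.
        this.mutations = Collections.unmodifiableList(new ArrayList<>(mutations));
    }

    List<M> mutations() {
        return mutations;
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(Arrays.asList("m1", "m2"));
        BatchSketch<String> batch = new BatchSketch<>(UUID.randomUUID(), System.currentTimeMillis(), source);
        source.add("m3");   // does not affect the batch
        System.out.println("mutations in batch: " + batch.mutations().size());
    }
}
```

The defensive copy is a design choice made for the sketch. A later comment in this thread asks to avoid a redundant {{ArrayList}} in {{Batch}}, so real code may prefer to take ownership of the collection passed to the constructor rather than copy it.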



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-13 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695004#comment-14695004
 ] 

Stefania commented on CASSANDRA-9673:
-

The tests should be good now, but I've launched one more build just in case. 
The results will be here:

http://cassci.datastax.com/job/stef1927-9673-dtest/
http://cassci.datastax.com/job/stef1927-9673-testall/

This is ready for review, pay attention to [this 
line|https://github.com/stef1927/cassandra/blob/b65cdb515ac3144116aaae7836d6ae1befd197b6/src/java/org/apache/cassandra/service/StorageProxy.java#L680]
 that I moved out of the for loop in {{mutateMV}}. It made no sense to have it 
inside the loop, and it was causing some MV utests to time-out on Jenkins.

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1

 Attachments: 9673_001.tar.gz, 9673_004.tar.gz, 
 gc_times_first_node_patched_004.png, gc_times_first_node_trunk_004.png


 Currently we allocate an on-heap {{ByteBuffer}} to serialize the batched 
 mutations into, before sending it to a distant node, generating unnecessary 
 garbage (potentially a lot of it).
 With materialized views using the batchlog, it would be nice to optimise the 
 write path:
 - introduce a new verb ({{Batch}})
 - introduce a new message ({{BatchMessage}}) that would encapsulate the 
 mutations, expiration, and creation time (similar to {{HintMessage}} in 
 CASSANDRA-6230)
 - have MS serialize it directly instead of relying on an intermediate buffer
 To avoid merely shifting the temp buffer to the receiving side(s) we should 
 change the structure of the batchlog table to use a list or a map of 
 individual mutations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-12 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693202#comment-14693202
 ] 

Stefania commented on CASSANDRA-9673:
-

[~iamaleksey] I still need to do mixed version testing and check the CI results 
but the rebase on 3.0 and the changes you requested are available for review if 
you want to speed things up. Else I'll post another update when the tests are 
complete.

I've left you a question in SS with a TODO: I am not sure why in {{mutateMV}} 
we insert a batch mutation containing all mutations for every single mutation; 
it seems wrong to me.

For replaying legacy mutations, I've opted for a conversion done before 
replaying, which is not very efficient, but keeps the code clean. I figured 
mixed version clusters are transient but if it concerns you I can enhance it.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-06 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660112#comment-14660112
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

[~Stefania] Now that CASSANDRA-7237 has been resolved, can you reapply the 
patch on top of latest cassandra-3.0? This will be more difficult than a 
regular rebase.

CASSANDRA-6477 intervened with MS verbs for batchlog, and CASSANDRA-7237 added 
the new table. In addition to compatibility measures in your existing patch, we 
now also need to intercept mutations for the old batchlog table and properly 
convert them to mutations for the new table, with the new structure.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-08-06 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660176#comment-14660176
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

Oh, and one minor thing - I think it's okay now to start reusing the deprecated 
old verbs for new purposes. {{STREAM_INITIATE}} and {{STREAM_INITIATE_DONE}} 
can be taken over for batchlog purposes.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-27 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642699#comment-14642699
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

It's not been forgotten - will finish the review shortly, sorry for the delay.



[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-20 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634026#comment-14634026
 ] 

Joshua McKenzie commented on CASSANDRA-9673:


Thanks for putting these results together [~stefania_alborghetti] - we as a 
project need to be more disciplined about producing results like these and it 
really helps clarify both the improvement and the use-case that will benefit 
from the change.

Looking good!

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1

 Attachments: 9673_001.tar.gz, 9673_004.tar.gz, 
 gc_times_first_node_patched_004.png, gc_times_first_node_trunk_004.png




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-16 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629523#comment-14629523
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

You may want to try (much) larger batches to show that there is a difference 
(assuming this user profile does what you think it's doing in the first place).

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1

 Attachments: 9673.tar.gz




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-16 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629357#comment-14629357
 ] 

Stefania commented on CASSANDRA-9673:
-

I've attached an archive containing .jfr files for trunk and the patched 
branch. I generated the files using the following dtest:

{code}
import subprocess

from dtest import debug  # dtest's logging helper

# The methods below belong to a dtest Tester subclass.

def logged_batch_stress_test(self):
    """
    @jira_ticket CASSANDRA-9673, a stress test to record any improvements
    in GC usage
    """
    cluster = self.cluster

    cluster.populate(3)
    cluster.start(wait_other_notice=True,
                  jvm_args=["-XX:+UnlockCommercialFeatures", "-XX:+FlightRecorder"])
    nodes = cluster.nodelist()

    self._start_jfr_recording(nodes)

    # Run the user profile from the first node, which acts as coordinator
    nodes[0].stress(['user', 'profile=/home/stefania/git/cstar/9673.yaml',
                     'ops(insert=1,)', 'n=5', '-rate', 'threads=8'])

    self._dump_jfr_recording(nodes)

def _start_jfr_recording(self, nodes):
    """
    Start jfr recording provided the cluster was started with
    jvm_args=["-XX:+UnlockCommercialFeatures", "-XX:+FlightRecorder"]
    """
    for node in nodes:
        p = subprocess.Popen(['jcmd', str(node.pid), 'JFR.start'],
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
        stdout, stderr = p.communicate()
        debug(stdout)
        debug(stderr)

def _dump_jfr_recording(self, nodes):
    """
    Save jfr recording to file
    """
    for node in nodes:
        p = subprocess.Popen(['jcmd', str(node.pid), 'JFR.dump',
                              'recording=1',
                              'filename=recording_{}.jfr'.format(node.address())],
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
        stdout, stderr = p.communicate()
        debug(stdout)
        debug(stderr)
{code}

9673.yaml is included in the attached archive and is also available 
[here|https://dl.dropboxusercontent.com/u/15683245/9673.yaml]. The user schema 
is there only because I couldn't find any other way to use logged batches in 
cassandra-stress.

I've also run a cperf test with the same stress test:

http://cstar.datastax.com/tests/id/20a0d848-2b84-11e5-be06-42010af0688f

I don't notice any differences in the cperf test, and I am not at all familiar 
with analyzing .jfr files (this is the first time I have used FlightRecorder). 
If anything, it seems to me that the patched branch uses less memory but has 
more GCs, but perhaps I should have used a bigger sample. I also only looked at 
the coordinator .jfr files (the first node).

[~JoshuaMcKenzie], any suggestions or comments?

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1

 Attachments: 9673.tar.gz




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-16 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630677#comment-14630677
 ] 

Stefania commented on CASSANDRA-9673:
-

The user profile is correct; I verified with a temporary debug line that 
SP.syncWriteToBatchlog() is executing.

With progressively bigger tests, however, it becomes clearer that the GC times 
improve with the patched version. I've attached the results with *n=50*, 
file 004, along with the JMC screenshots of the GC times of the first node.

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1

 Attachments: 9673_001.tar.gz, 9673_004.tar.gz, 
 gc_times_first_node_patched_004.png, gc_times_first_node_trunk_004.png




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-15 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628104#comment-14628104
 ] 

Joshua McKenzie commented on CASSANDRA-9673:


While logically this looks like a clear win for batchlog usage w/gc pressure, 
could we get a couple of .jfr attached to the ticket to illustrate the 
difference made by the patch?

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-15 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627593#comment-14627593
 ] 

Stefania commented on CASSANDRA-9673:
-

[~iamaleksey], the code is ready for a first round of review. The only things 
still shared with write mutations are 
{{DatabaseDescriptor.getWriteRpcTimeout()}} and {{WriteTimeoutException}}. Let 
me know if you want to change these two as well; however, I assume you don't 
want to change the WRITE_TIMEOUT error code in the native protocol.

As for testing, in addition to the unit tests, I wrote two new dtests to check 
that we can still inter-operate with 2.2 nodes. However, until CASSANDRA-9704 
is delivered we cannot run these tests. I feel we should have more tests in 
this area: for example, I couldn't find any test checking that we actually 
replay a batch in case of failure. As usual, though, I don't know how to write 
such tests without a way to inject failures or to mock objects.
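
To illustrate the kind of failure injection meant here, the following is a toy sketch in Python, not Cassandra code. {{ToyBatchlog}} and its methods are hypothetical names, and the model only assumes that a batch stays logged until all of its mutations have been applied. A mock stands in for the write path and raises on the second call, so the batch has to be replayed.

```python
from unittest import mock


class ToyBatchlog:
    """Toy model: a batch stays logged until all its mutations apply."""

    def __init__(self, apply_mutation):
        self.apply_mutation = apply_mutation  # callable standing in for the write path
        self.pending = {}                     # batch id -> list of mutations

    def execute(self, batch_id, mutations):
        # Log the batch first, then try to apply it.
        self.pending[batch_id] = list(mutations)
        self._apply_and_clear(batch_id)

    def replay(self):
        # Retry every batch that is still logged.
        for batch_id in list(self.pending):
            self._apply_and_clear(batch_id)

    def _apply_and_clear(self, batch_id):
        try:
            for mutation in self.pending[batch_id]:
                self.apply_mutation(mutation)
        except IOError:
            return  # keep the batch logged for a later replay
        del self.pending[batch_id]


# Inject a failure on the second write, then let the replay succeed.
writes = mock.Mock(side_effect=[None, IOError("injected"), None, None])
batchlog = ToyBatchlog(writes)
batchlog.execute("b1", ["m1", "m2"])
still_logged_after_failure = "b1" in batchlog.pending
batchlog.replay()
```

The mock records every call, so a test can assert both that the batch survived the injected failure and that the replay re-applied all of its mutations.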

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-14 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626160#comment-14626160
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

bq. We don't need to, but it might already have been done by CASSANDRA-6477.

Actually, scratch that. On the receiving side, we most definitely shouldn't - 
we should be reusing the {{MUTATION}} stage.

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-14 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626143#comment-14626143
 ] 

Stefania commented on CASSANDRA-9673:
-

[~iamaleksey] do we also need a BATCH_RESPONSE verb or should we just keep on 
using REQUEST_RESPONSE? I've added BATCH_RESPONSE but it is functionally 
identical to REQUEST_RESPONSE. The same goes for the handler, 
WriteResponseHandler, although I have not duplicated that one. Another question 
is whether we need to introduce a new stage or whether it is OK to keep on 
using Stage.MUTATION.

I started writing a dtest to check that we can still support older nodes, e.g. 
2.2, but things are quite broken at the moment, for example in the ReadCommand 
serializer:

{code}
if (version < MessagingService.VERSION_30)
    throw new UnsupportedOperationException();
{code}

cc [~slebresne] - do we already have a ticket or plan for fixing compatibility 
with older nodes?


 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-14 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626150#comment-14626150
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

bq. do we also need a BATCH_RESPONSE verb or should we just keep on using 
REQUEST_RESPONSE? I've added BATCH_RESPONSE but it is functionally identical to 
REQUEST_RESPONSE.

No, we reuse {{REQUEST_RESPONSE}} for almost everything. You do create a new 
response message class, but reuse the same verb. Look at the branch for 
CASSANDRA-6230 to see how {{HintResponse}} is being handled.
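
As a rough illustration of why a single response verb is enough, here is a toy model in Python rather than Cassandra's Java. The class names {{Messaging}} and {{BatchStoreResponse}} are hypothetical; the sketch only assumes that a reply echoes the id of its request, so the receiver can route it to the registered callback regardless of the payload class.

```python
import itertools

REQUEST_RESPONSE = "REQUEST_RESPONSE"  # the single verb shared by all replies


class BatchStoreResponse:
    """Hypothetical reply payload for a batch store request."""


class HintResponse:
    """Hypothetical reply payload for a hint request."""


class Messaging:
    """Toy messaging layer that routes replies by message id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._callbacks = {}

    def send_request(self, callback):
        """Register a callback and return the id the reply must echo."""
        message_id = next(self._ids)
        self._callbacks[message_id] = callback
        return message_id

    def receive(self, verb, message_id, payload):
        """Route a reply to its callback; the verb carries no routing info."""
        if verb != REQUEST_RESPONSE:
            raise ValueError("not a response: %s" % verb)
        callback = self._callbacks.pop(message_id, None)
        if callback is not None:  # None: callback expired or already served
            callback(payload)


ms = Messaging()
seen = []
batch_id = ms.send_request(lambda p: seen.append(("batch", type(p).__name__)))
hint_id = ms.send_request(lambda p: seen.append(("hint", type(p).__name__)))
ms.receive(REQUEST_RESPONSE, hint_id, HintResponse())
ms.receive(REQUEST_RESPONSE, batch_id, BatchStoreResponse())
```

Both replies travel under the same verb, yet each reaches the callback of its own request, which is why a new response message class does not need a new verb in this model.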

bq. Another question is whether we need to introduce a new stage or is it OK to 
keep on using Stage.MUTATION?

We don't *need* to, but it might already have been done by CASSANDRA-6477.

bq. do we already have a ticket or plan for fixing compatibility with older 
nodes?

CASSANDRA-9704

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-13 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624419#comment-14624419
 ] 

Stefania commented on CASSANDRA-9673:
-

[~iamaleksey] can you elaborate a bit more on how to achieve this:

bq. To avoid merely shifting the temp buffer to the receiving side(s) we should 
change the structure of the batchlog table to use a list or a map of individual 
mutations.

On the receiver, is there a way to do this without deserializing each mutation 
only to serialize it again, either into a blob (byte buffer) or into something 
like {{list<mutation_type>}}, where {{mutation_type}} is

{code}
ksname : string
key : blob
updates : map<uuid, blob>
{code}

Isn't this worse than just leaving all mutations serialized in a single byte 
buffer copied directly from the incoming message and inserted in the batchlog 
table as the data blob?

We could use the {{BufferPool}} to minimize GC; we would just need to find a 
way to give the buffers back to the pool eventually.
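
To make the "give the buffers back" concern concrete, here is a minimal Python sketch of a pool whose buffers are returned automatically when the borrowing scope ends. {{SimpleBufferPool}} is a hypothetical toy and says nothing about how Cassandra's {{BufferPool}} is actually implemented.

```python
from contextlib import contextmanager


class SimpleBufferPool:
    """Toy pool of fixed-size bytearrays; not Cassandra's BufferPool."""

    def __init__(self, buffer_size, max_pooled):
        self.buffer_size = buffer_size
        self.max_pooled = max_pooled
        self._free = []
        self.allocations = 0  # how many buffers were freshly allocated

    def take(self):
        """Reuse a pooled buffer if one is free, otherwise allocate."""
        if self._free:
            return self._free.pop()
        self.allocations += 1
        return bytearray(self.buffer_size)

    def give_back(self, buf):
        """Return a buffer; it is dropped if the pool is already full."""
        if len(self._free) < self.max_pooled:
            self._free.append(buf)

    @contextmanager
    def borrowed(self):
        """Borrow a buffer that is returned even if the body raises."""
        buf = self.take()
        try:
            yield buf
        finally:
            self.give_back(buf)


pool = SimpleBufferPool(buffer_size=16, max_pooled=2)
with pool.borrowed() as first:
    first[0:2] = b"ab"
with pool.borrowed() as second:
    pass
```

The second borrow reuses the buffer released by the first, so only one allocation takes place. The hard part raised above remains: the buffer must not be returned while anything else still references it.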

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-13 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624500#comment-14624500
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

Yes, I thought I mentioned it offline (and it should really all be in the 
ticket, either way, so my bad). The mutations will now be stored in a map 
{{<something, blob>}}.

The missing, and, again, unexplained part is that we don't decode the message 
from {{ByteBuffer}}s into {{Mutation}}s on the receiving side.

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-07-13 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624618#comment-14624618
 ] 

Stefania commented on CASSANDRA-9673:
-

Thanks Aleksey, as discussed offline, we'll go with a list of {{blob}}, one 
per mutation, no need to decode any mutation.
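
A minimal sketch of the idea, written in Python for brevity: assuming a hypothetical wire layout in which a 4-byte count is followed by length-prefixed serialized mutations, the receiver can slice the payload into one blob per mutation without decoding any of them. The function names and the layout are illustrative assumptions, not the format used by the patch.

```python
import struct


def split_mutation_blobs(payload):
    """Slice a payload of length-prefixed mutations into one blob each.

    Hypothetical layout: [count:int32][len:int32][bytes]... (big-endian).
    The mutation bytes are never interpreted, only copied out.
    """
    view = memoryview(payload)
    (count,) = struct.unpack_from(">i", view, 0)
    offset = 4
    blobs = []
    for _ in range(count):
        (size,) = struct.unpack_from(">i", view, offset)
        offset += 4
        blobs.append(bytes(view[offset:offset + size]))
        offset += size
    return blobs


def join_mutation_blobs(blobs):
    """Inverse of split_mutation_blobs, used here only to build sample input."""
    parts = [struct.pack(">i", len(blobs))]
    for blob in blobs:
        parts.append(struct.pack(">i", len(blob)))
        parts.append(blob)
    return b"".join(parts)


sample = [b"m1", b"", b"mutation-3"]
round_tripped = split_mutation_blobs(join_mutation_blobs(sample))
```

Each element of the returned list could then be stored as one entry of the list of {{blob}} values, with the cost limited to reading the length prefixes.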

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Stefania
 Fix For: 3.0.0 rc1




[jira] [Commented] (CASSANDRA-9673) Improve batchlog write path

2015-06-28 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605020#comment-14605020
 ] 

Aleksey Yeschenko commented on CASSANDRA-9673:
--

Marking as 3.X because it's not blocking 3.0, but it would be nice to have it 
in before the RC happens.

 Improve batchlog write path
 ---

 Key: CASSANDRA-9673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
 Fix For: 3.x

