[ https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712610#comment-14712610 ]
Stefania edited comment on CASSANDRA-9673 at 8/26/15 6:54 AM: -------------------------------------------------------------- Thanks for all your work, it's in much better shape now. bq. As for migration, my preference would be to make it happen at the borders only. So I'd ask you again to move this live upgrade batchlog mutation conversion to MutationVerbHandler, and away from Keyspace::apply. The latter is just too deep. Done. I was trying to limit the impact so I pushed it down until I found the loop on the partition updates. I've added a check on the message version though, so we should be OK. bq. A standalone test for LegacyBatchMigrator::converBatchEntries would be nice. {{BatchLogManager.testReplay}} tests this (now that we moved the conversion to {{MutationVerbHandler}}) bq. I would also prefer to move the whole package to o.a.c.batchlog instead of o.a.c.db.batch. o.a.c.db is overpopulated, and, frankly, meaningless. I've been depopulating it slowly - with schema and hints being separate top-level packages already. Done. -- Hinted handoff dtests were broken so I reverted vints in {{HintsMessage}}. I think the hint size encoding must match what's stored on disk? If I am correct then I believe {{HintsBuffer.ENTRY_OVERHEAD_SIZE}} also must change, since it wasn't totally trivial to fix this I reverted {{HintsMessage}}. I did not revert {{Hint}} however, up to you if you want to fix everything here or move to another ticket. Regarding {{MaterializedViewLongTest}}, I ran it in a loop for 10 times and it failed once out of 10 times on both unpatched 3.0 and patched 3.0. There is one obvious problem in that the test is still looking at _system.batchlog_ rather than _system.batches_, I fixed that. I also think we need to wait for the ordinary mutations to be applied in the test as well, so that means the ordinary MUTATION stage. Finally shouldn't the mutations with only a local endpoint in SP.mutateMV be applied in MATERIALIZED_VIEW_MUTATION stage? See tentative fix [here|https://github.com/stef1927/cassandra/commit/7d772ebc731524a3cdf08225451e8022bc98b7be]. With the fix applied the test ran 20 times successfully but this doesn't necessarily mean it is OK now. Also, the change in SP may not be needed, provided we wait for the MUTATION stage as well in the test (I did not test without the changes to SP). was (Author: stefania): Thanks for all your work, it's in much better shape now. bq. As for migration, my preference would be to make it happen at the borders only. So I'd ask you again to move this live upgrade batchlog mutation conversion to MutationVerbHandler, and away from Keyspace::apply. The latter is just too deep. Done. I was trying to limit the impact so I pushed it down until I found the loop on the partition updates. I've added a check on the message version though, so we should be OK. bq. A standalone test for LegacyBatchMigrator::converBatchEntries would be nice. {{BatchLogManager.testReplay}} tests this (now that we moved the conversion to {{MutationVerbHandler}}) bq. I would also prefer to move the whole package to o.a.c.batchlog instead of o.a.c.db.batch. o.a.c.db is overpopulated, and, frankly, meaningless. I've been depopulating it slowly - with schema and hints being separate top-level packages already. Done. -- Hinted handoff dtests were broken so I reverted vints in {{HintsMessage}}. I think the hint size must match what's stored on disk? If I am correct then I believe {{HintsBuffer.ENTRY_OVERHEAD_SIZE}} also must change, since it wasn't totally trivial to fix this I reverted {{HintsMessage}}. I did not revert {{Hint}} however, up to you if you want to fix everything here or move to another ticket. Regarding {{MaterializedViewLongTest}}, I ran it in a loop for 10 times and it failed once out of 10 times on both unpatched 3.0 and patched 3.0. There is one obvious problem in that the test is still looking at _system.batchlog_ rather than _system.batches_, I fixed that. I also think we need to wait for the ordinary mutations to be applied in the test as well, so that means the ordinary MUTATION stage. Finally shouldn't the mutations with only a local endpoint in SP.mutateMV be applied in MATERIALIZED_VIEW_MUTATION stage? See tentative fix [here|https://github.com/stef1927/cassandra/commit/7d772ebc731524a3cdf08225451e8022bc98b7be]. With the fix applied the test ran 20 times successfully but this doesn't necessarily mean it is OK now. Also, the change in SP may not be needed, provided we wait for the MUTATION stage as well in the test (I did not test without the changes to SP). > Improve batchlog write path > --------------------------- > > Key: CASSANDRA-9673 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9673 > Project: Cassandra > Issue Type: Improvement > Reporter: Aleksey Yeschenko > Assignee: Stefania > Labels: performance > Fix For: 3.0 beta 2 > > Attachments: 9673_001.tar.gz, 9673_004.tar.gz, > gc_times_first_node_patched_004.png, gc_times_first_node_trunk_004.png > > > Currently we allocate an on-heap {{ByteBuffer}} to serialize the batched > mutations into, before sending it to a distant node, generating unnecessary > garbage (potentially a lot of it). > With materialized views using the batchlog, it would be nice to optimise the > write path: > - introduce a new verb ({{Batch}}) > - introduce a new message ({{BatchMessage}}) that would encapsulate the > mutations, expiration, and creation time (similar to {{HintMessage}} in > CASSANDRA-6230) > - have MS serialize it directly instead of relying on an intermediate buffer > To avoid merely shifting the temp buffer to the receiving side(s) we should > change the structure of the batchlog table to use a list or a map of > individual mutations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)