[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986225#comment-13986225 ]

Joshua McKenzie commented on CASSANDRA-3668:

Looks good - I went back and forth between re-using those data structures in the StreamCoordinator vs. limiting its scope to the StreamResultFuture only during implementation. While this approach has a touch of duplication to it, I think it's fine in order to keep the scope limited.

Test of v3 looks good. I'm getting some NPEs from SSTableReader.scheduleTidy closing up IndexSummaries, but I don't think that has anything to do with this patch.

+1

> Parallel streaming for sstableloader
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
> Issue Type: Improvement
> Components: API
> Reporter: Manish Zope
> Assignee: Joshua McKenzie
> Priority: Minor
> Labels: streaming
> Fix For: 2.1 rc1
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3668-v3.txt, 3668_v2.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> One of my colleagues had reported a bug regarding the degraded performance of the sstable generator and sstable loader.
> ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589
> As stated in the above issue, generator performance is rectified, but performance of the sstableloader is still an issue.
> 3589 is marked as a duplicate of 3618. Both issues show resolved status, but the problem with sstableloader still exists.
> So I am opening another issue so that the sstableloader problem does not go unnoticed.
> FYI: We have tested the generator part with the patch given in 3589. It's working fine.
> Please let us know if you require further input from our side.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971786#comment-13971786 ]

Joshua McKenzie commented on CASSANDRA-3668:

Attached patch against trunk. Enables multiple StreamSessions per host abstracted behind a new StreamCoordinator class. I also nested the "slice up sstable ranges to # of requested connections" logic in there.

Lastly - I modified the command-line tool to always print stack-traces on exceptions as discussed in CASSANDRA-7015.

> Parallel streaming for sstableloader
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
> Issue Type: Improvement
> Components: API
> Reporter: Manish Zope
> Assignee: Joshua McKenzie
> Priority: Minor
> Labels: streaming
> Fix For: 2.1 beta2
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3668_v2.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
> Original Estimate: 48h
> Remaining Estimate: 48h
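As a rough illustration of "multiple StreamSessions per host" behind one coordinating object, the sketch below hands out session indices per host in round-robin order, capped at the requested number of connections. The class `PerHostSessions` and its method `nextSessionIndex` are hypothetical names chosen for this example; this is not the StreamCoordinator code from the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Cassandra: tracks how many requests each
// host has seen and maps every request onto one of N session slots.
final class PerHostSessions {
    private final int connectionsPerHost;
    private final Map<String, Integer> requestsByHost = new HashMap<>();

    PerHostSessions(int connectionsPerHost) {
        if (connectionsPerHost < 1) {
            throw new IllegalArgumentException("connectionsPerHost must be >= 1");
        }
        this.connectionsPerHost = connectionsPerHost;
    }

    /** Returns the session slot (0-based) to use for the next transfer to this host. */
    int nextSessionIndex(String host) {
        // merge() stores 1 on the first call and increments afterwards.
        int seenBefore = requestsByHost.merge(host, 1, Integer::sum) - 1;
        return seenBefore % connectionsPerHost;
    }

    public static void main(String[] args) {
        PerHostSessions sessions = new PerHostSessions(2);
        for (int i = 0; i < 4; i++) {
            System.out.println("127.0.0.1 -> session " + sessions.nextSessionIndex("127.0.0.1"));
        }
    }
}
```

With two connections per host, successive requests for the same host alternate between slots 0 and 1, while each new host starts again at slot 0.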
[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966872#comment-13966872 ]

Joshua McKenzie commented on CASSANDRA-3668:

I have a stabilized version against 2.0.6. We have some other issues on trunk right now on the streaming path, so I'll wait until we have those ironed out to rebase to trunk, re-test, and post a patch.

Some performance #'s against a single node locally:
{code:title=single_node}
Summary statistics:
   Connections per host:         : 1
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 43382
   Average transfer rate (MB/s): : 22
   Peak transfer rate (MB/s):    : 25

Summary statistics:
   Connections per host:         : 2
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 25794
   Average transfer rate (MB/s): : 38
   Peak transfer rate (MB/s):    : 45

Summary statistics:
   Connections per host:         : 4
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 20063
   Average transfer rate (MB/s): : 48
   Peak transfer rate (MB/s):    : 60

Summary statistics:
   Connections per host:         : 6
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 19350
   Average transfer rate (MB/s): : 50
   Peak transfer rate (MB/s):    : 66
{code}

With 3 nodes hosted locally on ccm and 6 connections per host, I'm pushing a comparable 65MB/s peak and 44MB/s average.

I'll update once we get trunk sorted out and I rebase to it.

> Parallel streaming for sstableloader
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
> Issue Type: Improvement
> Components: API
> Reporter: Manish Zope
> Assignee: Joshua McKenzie
> Priority: Minor
> Labels: streaming
> Fix For: 2.1 beta2
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
> Original Estimate: 48h
> Remaining Estimate: 48h
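To make the scaling in the single-node numbers easier to read, the speedup over one connection can be computed from the reported total durations alone. The sketch below does only that arithmetic; the class name `StreamSpeedup` is made up for this example, and the durations are copied from the summary statistics in the comment.

```java
// Illustrative arithmetic only: speedup = baseline duration / measured duration.
final class StreamSpeedup {
    /** Ratio of the baseline duration to the measured duration (both in milliseconds). */
    static double speedup(long baselineMs, long measuredMs) {
        if (baselineMs <= 0 || measuredMs <= 0) {
            throw new IllegalArgumentException("durations must be positive");
        }
        return (double) baselineMs / measuredMs;
    }

    public static void main(String[] args) {
        int[] connections = {1, 2, 4, 6};
        long[] durationMs = {43382, 25794, 20063, 19350}; // from the comment's statistics
        for (int i = 0; i < connections.length; i++) {
            System.out.printf("%d connection(s): %.2fx%n",
                    connections[i], speedup(durationMs[0], durationMs[i]));
        }
    }
}
```

By this measure, 2, 4 and 6 connections come out at roughly 1.68x, 2.16x and 2.24x the single-connection run, so most of the gain is already reached by 4 connections.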
[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960435#comment-13960435 ]

Joshua McKenzie commented on CASSANDRA-3668:

A quick update on this - going the route of multiple StreamSessions per StreamPlan with the current architecture is going to require some restructuring. The current design assumes a single socket for streaming, and multiple StreamSessions means multiple ConnectionHandlers, all of which assume ownership of polling the readChannel on a socket.

To respect the single-socket-for-streaming paradigm we currently have, I'm working on promoting IncomingMessageHandler and OutgoingMessageHandler into higher-level abstractions, rather than the current paradigm of each being owned by a StreamSession. They would be responsible for polling the socket: on the inbound side, dispatching to the various StreamSessions based on deserialized session indices; on the outbound side, following the current PriorityQueue polling mechanism.

It doesn't look like we're at risk of a bottleneck on network resources even over a single socket, as my preliminary parallelized stream testing is peaking at ~55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection, it's still a respectable increase.

> Parallel streaming for sstableloader
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
> Issue Type: Improvement
> Components: API
> Reporter: Manish Zope
> Assignee: Joshua McKenzie
> Priority: Minor
> Labels: streaming
> Fix For: 2.1 beta2
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
> Original Estimate: 48h
> Remaining Estimate: 48h
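The inbound half of that idea, one reader on the socket that routes each message to a session by its deserialized session index, can be sketched as follows. Everything here is a simplified stand-in: `SessionDispatcher`, `Message` and `Session` are hypothetical types for illustration and do not mirror the real IncomingMessageHandler or StreamSession classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demultiplexer: one dispatcher per connection, many sessions behind it.
final class SessionDispatcher {
    /** Stand-in for a deserialized stream message carrying its session index. */
    static final class Message {
        final int sessionIndex;
        final String payload;

        Message(int sessionIndex, String payload) {
            this.sessionIndex = sessionIndex;
            this.payload = payload;
        }
    }

    /** Stand-in for a session; it only records what it was handed. */
    static final class Session {
        final List<String> received = new ArrayList<>();

        void onMessage(Message message) {
            received.add(message.payload);
        }
    }

    private final Map<Integer, Session> sessionsByIndex = new HashMap<>();

    void register(int sessionIndex, Session session) {
        sessionsByIndex.put(sessionIndex, session);
    }

    /** Routes one message; returns false when no session owns its index. */
    boolean dispatch(Message message) {
        Session target = sessionsByIndex.get(message.sessionIndex);
        if (target == null) {
            return false;
        }
        target.onMessage(message);
        return true;
    }
}
```

Keeping the lookup in a single dispatcher means only one component ever polls the channel, which is the ownership problem the comment describes with multiple ConnectionHandlers.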
[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920122#comment-13920122 ]

Joshua McKenzie commented on CASSANDRA-3668:

With streaming being overhauled in 2.0, it looks like I'll need to revisit the core implementation on this. Got the following from Yuki:

"In streaming 2.0, transferring/receiving files to one destination is managed by a StreamSession object. To parallelize streaming, I think the easiest way is to make multiple StreamSessions per host, instead of the one per host that we use right now. So, given the set of SSTable files in the bulk loader, divide those up into groups equal to the number of parallel streams, and assign each group to one of multiple StreamSessions for the same destination.

One thing to be careful about in this approach is that you need to add one more id to streaming messages in order to distinguish messages to the same destination. This also changes the serialization format of the protocol, so you need to update the streaming protocol version defined in StreamMessage."

> Parallel streaming for sstableloader
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
> Issue Type: Improvement
> Components: API
> Reporter: Manish Zope
> Assignee: Joshua McKenzie
> Priority: Minor
> Labels: streaming
> Fix For: 2.1 beta2
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
> Original Estimate: 48h
> Remaining Estimate: 48h
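The "divide those up into groups" step in Yuki's outline can be sketched as a plain round-robin split. This is a minimal sketch under the assumption that files are interchangeable and no size balancing is wanted; `SSTableGrouping.partition` is a hypothetical helper, not a Cassandra API, and the file names in `main` are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a list of items into groups, one per parallel stream.
final class SSTableGrouping {
    /**
     * Distributes items round-robin into at most `groups` lists, preserving the
     * relative order within each list. Never returns more groups than items,
     * except that an empty input yields a single empty group.
     */
    static <T> List<List<T>> partition(List<T> items, int groups) {
        if (groups < 1) {
            throw new IllegalArgumentException("groups must be >= 1");
        }
        int count = Math.min(groups, Math.max(items.size(), 1));
        List<List<T>> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < items.size(); i++) {
            result.get(i % count).add(items.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("ks-cf-1-Data.db", "ks-cf-2-Data.db", "ks-cf-3-Data.db");
        System.out.println(partition(files, 2));
    }
}
```

Each resulting group would then go to its own session for the same destination, with the group's position serving as the extra id that distinguishes messages on the wire.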
[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900880#comment-13900880 ]

Jonathan Ellis commented on CASSANDRA-3668:

[~yukim] can you outline for Josh what is needed here and let him take it from there?

> Parallel streaming for sstableloader
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
> Issue Type: Improvement
> Components: API
> Reporter: Manish Zope
> Assignee: Joshua McKenzie
> Priority: Minor
> Labels: streaming
> Fix For: 2.1
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
> Original Estimate: 48h
> Remaining Estimate: 48h
[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627604#comment-13627604 ]

Viktor Jevdokimov commented on CASSANDRA-3668:

Any progress on the issue?

> Parallel streaming for sstableloader
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
> Issue Type: Improvement
> Components: API
> Reporter: Manish Zope
> Assignee: Yuki Morishita
> Priority: Minor
> Labels: streaming
> Fix For: 2.0
>
> Attachments: 3668-1.1.txt, 3668-1.1-v2.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
> Original Estimate: 48h
> Remaining Estimate: 48h