[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-04-30 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986225#comment-13986225
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


Looks good - during implementation I went back and forth between re-using 
those data structures in the StreamCoordinator vs. limiting its scope to the 
StreamResultFuture only.  While this approach has a touch of duplication to 
it, I think it's fine in order to keep the scope limited.

Test of v3 looks good.  I'm getting some NPEs from SSTableReader.scheduleTidy 
closing up IndexSummaries, but I don't think that has anything to do with this 
patch.

+1

 Parallel streaming for sstableloader
 

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Manish Zope
Assignee: Joshua McKenzie
Priority: Minor
  Labels: streaming
 Fix For: 2.1 rc1

 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3668-v3.txt, 3668_v2.txt, 
 3688-reply_before_closing_writer.txt, sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h

 One of my colleagues had reported a bug regarding the degraded performance 
 of the sstable generator and sstableloader.
 ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 
 As stated in that issue, the generator's performance has been fixed, but the 
 performance of sstableloader is still a problem.
 3589 is marked as a duplicate of 3618. Both issues show resolved status, but 
 the problem with sstableloader still exists.
 So I am opening another issue so that the sstableloader problem does not go 
 unnoticed.
 FYI: We have tested the generator part with the patch given in 3589. It's 
 working fine.
 Please let us know if you require further input from our side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-04-16 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971786#comment-13971786
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


Attached patch against trunk.  It enables multiple StreamSessions per host, 
abstracted behind a new StreamCoordinator class.  I also nested in there the 
logic that slices up the sstable ranges according to the number of requested 
connections.  Lastly, I modified the command-line tool to always print stack 
traces on exceptions, as discussed in CASSANDRA-7015.
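
As a rough illustration of the "multiple sessions per host behind a coordinator" idea, here is a minimal Python sketch. It is not taken from the patch: the names `Coordinator`, `Session`, and `get_or_create_next_session` are hypothetical, and the real StreamCoordinator is Java code that may be organized quite differently. The sketch only shows one plausible policy: create sessions for a host until the requested connection count is reached, then hand out the existing ones in rotation.

```python
from collections import defaultdict
from typing import Dict, List


class Session:
    """Hypothetical stand-in for one streaming session to a host."""

    def __init__(self, host: str, index: int) -> None:
        self.host = host
        self.index = index  # position of this session among the host's sessions


class Coordinator:
    """Hypothetical coordinator owning up to N sessions per host."""

    def __init__(self, connections_per_host: int) -> None:
        if connections_per_host < 1:
            raise ValueError("connections_per_host must be >= 1")
        self.connections_per_host = connections_per_host
        self._sessions: Dict[str, List[Session]] = defaultdict(list)
        self._next: Dict[str, int] = defaultdict(int)

    def get_or_create_next_session(self, host: str) -> Session:
        sessions = self._sessions[host]
        # Create new sessions until the per-host limit is reached.
        if len(sessions) < self.connections_per_host:
            session = Session(host, len(sessions))
            sessions.append(session)
            return session
        # Afterwards, rotate through the existing sessions.
        i = self._next[host] % len(sessions)
        self._next[host] = i + 1
        return sessions[i]
```

Callers that want to spread work over a host's connections would ask the coordinator for the next session each time they have something to transfer, without needing to know how many sessions exist.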


 Parallel streaming for sstableloader
 

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Manish Zope
Assignee: Joshua McKenzie
Priority: Minor
  Labels: streaming
 Fix For: 2.1 beta2

 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3668_v2.txt, 
 3688-reply_before_closing_writer.txt, sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h



[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-04-11 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966872#comment-13966872
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


Have a stabilized version against 2.0.6.  We have some other issues on trunk 
right now on the streaming path, so I'll wait until those are ironed out to 
rebase to trunk, re-test, and post a patch.  Some performance numbers against 
a single node locally:
{code:title=single_node}
Summary statistics:
   Connections per host:         : 1
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 43382
   Average transfer rate (MB/s): : 22
   Peak transfer rate (MB/s):    : 25

Summary statistics:
   Connections per host:         : 2
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 25794
   Average transfer rate (MB/s): : 38
   Peak transfer rate (MB/s):    : 45

Summary statistics:
   Connections per host:         : 4
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 20063
   Average transfer rate (MB/s): : 48
   Peak transfer rate (MB/s):    : 60

Summary statistics:
   Connections per host:         : 6
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 19350
   Average transfer rate (MB/s): : 50
   Peak transfer rate (MB/s):    : 66
{code}

With 3 nodes hosted locally on ccm and 6 connections per host, I'm pushing a 
comparable 65MB/s peak and 44MB/s average.
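
To make the diminishing returns in the single-node runs easier to see, the reported total durations can be turned into speedups relative to one connection. The small Python sketch below uses only the "Total duration (ms)" values quoted above; the function names are illustrative, and "efficiency" (speedup divided by connection count) is a convenience metric added here, not something the tool reports.

```python
from typing import Dict

# Total duration (ms) per "Connections per host", copied from the runs above.
durations_ms: Dict[int, int] = {1: 43382, 2: 25794, 4: 20063, 6: 19350}


def speedups(durations: Dict[int, int]) -> Dict[int, float]:
    """Speedup of each run relative to the single-connection run."""
    baseline = durations[1]
    return {conns: baseline / ms for conns, ms in sorted(durations.items())}


def efficiency(durations: Dict[int, int]) -> Dict[int, float]:
    """Speedup divided by the number of connections used."""
    return {conns: s / conns for conns, s in speedups(durations).items()}


if __name__ == "__main__":
    for conns, s in speedups(durations_ms).items():
        print(f"{conns} connection(s): {s:.2f}x")
```

With these figures the speedup works out to roughly 1.68x at 2 connections, 2.16x at 4, and 2.24x at 6, so most of the gain arrives by 4 connections per host.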

I'll update once we get trunk sorted out and I rebase to it.

 Parallel streaming for sstableloader
 

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Manish Zope
Assignee: Joshua McKenzie
Priority: Minor
  Labels: streaming
 Fix For: 2.1 beta2

 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 
 3688-reply_before_closing_writer.txt, sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h



[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-04-04 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960435#comment-13960435
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


A quick update on this - going the route of multiple StreamSessions per 
StreamPlan with the current architecture is going to require some 
restructuring.  The current design assumes a single socket for streaming and 
multiple StreamSessions means multiple ConnectionHandlers, all of which assume 
ownership of polling the readChannel on a socket.

To respect the single-socket-for-streaming paradigm we currently have, I'm 
working on promoting IncomingMessageHandler and OutgoingMessageHandler into 
higher-level abstractions that are responsible for polling the socket, rather 
than the current paradigm of each being owned by a StreamSession.  On the 
inbound side they would dispatch to the various StreamSessions based on 
deserialized session indices; on the outbound side they would follow the 
current PriorityQueue polling mechanism.
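
The inbound half of that design can be sketched as a small demultiplexer: one reader owns the socket, reads a session index from each message, and hands the remainder to the matching session. The Python below is only an illustration under stated assumptions. The 4-byte big-endian index prefix, `encode_frame`, and `InboundDispatcher` are all hypothetical and do not describe Cassandra's actual wire format or classes.

```python
import struct
from typing import Callable, Dict

# Assumed layout: each frame starts with a 4-byte big-endian session index.
_INDEX = struct.Struct(">i")


def encode_frame(session_index: int, payload: bytes) -> bytes:
    """Prefix a payload with the index of the session it belongs to."""
    return _INDEX.pack(session_index) + payload


class InboundDispatcher:
    """Single owner of the inbound channel; routes frames to sessions."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Callable[[bytes], None]] = {}

    def register(self, session_index: int, handler: Callable[[bytes], None]) -> None:
        self._handlers[session_index] = handler

    def dispatch(self, frame: bytes) -> None:
        # Deserialize the session index, then pass the rest to that session.
        (session_index,) = _INDEX.unpack_from(frame, 0)
        try:
            handler = self._handlers[session_index]
        except KeyError:
            raise LookupError(f"no session registered for index {session_index}") from None
        handler(frame[_INDEX.size:])
```

Because only the dispatcher reads from the channel, the sessions never compete to poll the same socket, which is the conflict described above.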

It doesn't look like we're at risk of a bottleneck on network resources even 
over a single socket: my preliminary parallelized stream testing is peaking at 
~55MB/s on 5 connections per host vs. 49MB/s on 4 connections, with 
diminishing returns as we go higher.  Compared to the 24MB/s I'm benchmarking 
on a single connection, it's still a respectable increase.



[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-03-04 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920122#comment-13920122
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


With streaming being overhauled in 2.0 it looks like I'll need to revisit the 
core implementation on this.

Got the following from Yuki:
In streaming 2.0, transferring and receiving files to and from one destination 
is managed by a StreamSession object.
To parallelize streaming, I think the easiest way is to create multiple 
StreamSessions per host, instead of the one per host that we use right now.
So, given a set of SSTable files in the bulk loader, divide them into as many 
groups as there are parallel streams, and assign each group to one of the 
StreamSessions for the same destination.

One thing to be careful about in this approach is that you need to add one 
more id to streaming messages in order to distinguish messages to the same 
destination.
This also changes the serialization format of the protocol, so the streaming 
protocol version defined in StreamMessage needs to be updated.
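
The grouping step and the extra id can be pictured with the following Python sketch. It is an illustration only: round-robin assignment is just one simple way to divide the files, and `group_files`, `plan_sessions`, and `StreamMessageHeader` are hypothetical names, not Cassandra's API.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class StreamMessageHeader:
    """Hypothetical header: the plan id plus the extra per-destination id."""
    plan_id: str
    session_index: int


def group_files(files: Sequence[str], parallelism: int) -> List[List[str]]:
    """Deal files round-robin into at most `parallelism` non-empty groups."""
    if parallelism < 1:
        raise ValueError("parallelism must be >= 1")
    groups = [list(files[i::parallelism]) for i in range(parallelism)]
    return [g for g in groups if g]  # drop empty groups when files are few


def plan_sessions(
    plan_id: str, files: Sequence[str], parallelism: int
) -> List[Tuple[StreamMessageHeader, List[str]]]:
    """Pair each file group with a header carrying its session index."""
    return [
        (StreamMessageHeader(plan_id, index), group)
        for index, group in enumerate(group_files(files, parallelism))
    ]
```

Every message sent for a group would carry that group's `session_index`, which is what lets the receiver tell apart several sessions that share the same destination.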




[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-02-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900880#comment-13900880
 ] 

Jonathan Ellis commented on CASSANDRA-3668:

[~yukim] can you outline for Josh what is needed here and let him take it from 
there?

 Parallel streaming for sstableloader
 

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Manish Zope
Assignee: Joshua McKenzie
Priority: Minor
  Labels: streaming
 Fix For: 2.1

 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 
 3688-reply_before_closing_writer.txt, sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h



[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2013-04-10 Thread Viktor Jevdokimov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627604#comment-13627604
 ] 

Viktor Jevdokimov commented on CASSANDRA-3668:

Any progress on the issue?

 Parallel streaming for sstableloader
 

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Manish Zope
Assignee: Yuki Morishita
Priority: Minor
  Labels: streaming
 Fix For: 2.0

 Attachments: 3668-1.1.txt, 3668-1.1-v2.txt, 
 3688-reply_before_closing_writer.txt, sstable-loader performance.txt

   Original Estimate: 48h
  Remaining Estimate: 48h


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira