[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-04-30 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986225#comment-13986225
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


Looks good - I went back and forth between re-using those data structures in 
the StreamCoordinator vs. limiting its scope to the StreamResultFuture only 
during implementation.  While this approach has a touch of duplication to it I 
think it's fine in order to keep the scope limited.

Test of v3 looks good.  I'm getting some NPEs from SSTableReader.scheduleTidy 
while closing up IndexSummaries, but I don't think that has anything to do with 
this patch.

+1

> Parallel streaming for sstableloader
> 
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: API
>Reporter: Manish Zope
>Assignee: Joshua McKenzie
>Priority: Minor
>  Labels: streaming
> Fix For: 2.1 rc1
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3668-v3.txt, 3668_v2.txt, 
> 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> One of my colleagues reported a bug regarding the degraded performance 
> of the sstable generator and sstable loader.
> ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 
> As stated in that issue, generator performance has been rectified, but 
> performance of the sstableloader is still an issue.
> 3589 is marked as a duplicate of 3618. Both issues show resolved status, but 
> the problem with sstableloader still exists.
> So I am opening another issue so that the sstableloader problem does not go 
> unnoticed.
> FYI: We have tested the generator part with the patch given in 3589. It is 
> working fine.
> Please let us know if you require further input from our side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-04-16 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971786#comment-13971786
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


Attached patch against trunk.  Enables multiple StreamSessions per host 
abstracted behind a new StreamCoordinator class.  I also nested the "slice up 
sstable ranges to # of requested connections" logic in there.  Lastly - I 
modified the command-line tool to always print stack-traces on exceptions as 
discussed in CASSANDRA-7015.
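
As a rough illustration of the "multiple StreamSessions per host" idea, here is a minimal sketch in Java. It is not the StreamCoordinator from the attached patch: the class name `MiniStreamCoordinator`, the method `nextSessionIndex`, and the use of plain integers as stand-ins for sessions are all assumptions made for the example. It lazily creates up to a fixed number of sessions per host and then hands them out round-robin.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; integers stand in for real StreamSession objects.
final class MiniStreamCoordinator {
    private final int connectionsPerHost;
    // host -> indices of the sessions created so far for that host
    private final Map<String, List<Integer>> sessionsByHost = new HashMap<>();
    // host -> number of requests served after all sessions were created
    private final Map<String, Integer> requestsByHost = new HashMap<>();

    MiniStreamCoordinator(int connectionsPerHost) {
        if (connectionsPerHost < 1) {
            throw new IllegalArgumentException("connectionsPerHost must be >= 1");
        }
        this.connectionsPerHost = connectionsPerHost;
    }

    // Returns the index of the session that should take the next piece of work.
    int nextSessionIndex(String host) {
        List<Integer> sessions = sessionsByHost.computeIfAbsent(host, h -> new ArrayList<>());
        if (sessions.size() < connectionsPerHost) {
            // Still below the limit: create a new session for this host.
            int created = sessions.size();
            sessions.add(created);
            return created;
        }
        // Limit reached: cycle through the existing sessions.
        int turn = requestsByHost.merge(host, 1, Integer::sum) - 1;
        return sessions.get(turn % connectionsPerHost);
    }

    // Number of sessions created so far for the given host.
    int sessionCount(String host) {
        List<Integer> sessions = sessionsByHost.get(host);
        return sessions == null ? 0 : sessions.size();
    }

    public static void main(String[] args) {
        MiniStreamCoordinator coordinator = new MiniStreamCoordinator(2);
        for (int i = 0; i < 5; i++) {
            System.out.println("request " + i + " -> session "
                    + coordinator.nextSessionIndex("127.0.0.1"));
        }
    }
}
```

Keeping the per-host bookkeeping behind one class means callers only ask for "the next session" and never need to know how many connections were requested.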


> Parallel streaming for sstableloader
> 
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: API
>Reporter: Manish Zope
>Assignee: Joshua McKenzie
>Priority: Minor
>  Labels: streaming
> Fix For: 2.1 beta2
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3668_v2.txt, 
> 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h


[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-04-11 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966872#comment-13966872
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


Have a stabilized version against 2.0.6.  We have some other issues on trunk 
right now on the streaming path, so I'll wait until we have those ironed out to 
rebase to trunk, re-test, and post a patch.  Some performance numbers against a 
single node locally:
{code:title=single_node}
Summary statistics:
   Connections per host:         : 1
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 43382
   Average transfer rate (MB/s): : 22
   Peak transfer rate (MB/s):    : 25

Summary statistics:
   Connections per host:         : 2
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 25794
   Average transfer rate (MB/s): : 38
   Peak transfer rate (MB/s):    : 45

Summary statistics:
   Connections per host:         : 4
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 20063
   Average transfer rate (MB/s): : 48
   Peak transfer rate (MB/s):    : 60

Summary statistics:
   Connections per host:         : 6
   Total files transferred:      : 76
   Total bytes transferred:      : 2037105326
   Total duration (ms):          : 19350
   Average transfer rate (MB/s): : 50
   Peak transfer rate (MB/s):    : 66
{code}

With 3 nodes hosted locally on ccm and 6 connections per host, I'm pushing a 
comparable 65MB/s peak and 44MB/s average.
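
To make the diminishing returns in the single-node numbers easier to see, the reported durations can be turned into speedups relative to the single-connection run. The snippet below is only a worked-arithmetic sketch: the class name `SpeedupTable` and its `speedup` helper are made up for illustration, and the only inputs are the "Total duration (ms)" values quoted above.

```java
import java.util.Locale;

// Hypothetical helper for illustration; inputs are the durations reported above.
final class SpeedupTable {
    // Speedup of a run relative to the baseline run, both given in milliseconds.
    static double speedup(long baselineMs, long runMs) {
        return (double) baselineMs / runMs;
    }

    public static void main(String[] args) {
        int[] connectionsPerHost = {1, 2, 4, 6};
        long[] durationMs = {43382L, 25794L, 20063L, 19350L};
        for (int i = 0; i < durationMs.length; i++) {
            // Locale.ROOT keeps the decimal separator a '.' regardless of platform.
            System.out.println(String.format(Locale.ROOT,
                    "%d connection(s) per host: %.2fx",
                    connectionsPerHost[i], speedup(durationMs[0], durationMs[i])));
        }
    }
}
```

Going from 1 to 2 connections gives roughly a 1.7x speedup, while going from 4 to 6 only moves it from about 2.16x to about 2.24x.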

I'll update once we get trunk sorted out and I rebase to it.

> Parallel streaming for sstableloader
> 
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: API
>Reporter: Manish Zope
>Assignee: Joshua McKenzie
>Priority: Minor
>  Labels: streaming
> Fix For: 2.1 beta2
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 
> 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h


[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-04-04 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960435#comment-13960435
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


A quick update on this - going the route of multiple StreamSessions per 
StreamPlan with the current architecture is going to require some 
restructuring.  The current design assumes a single socket for streaming and 
multiple StreamSessions means multiple ConnectionHandlers, all of which assume 
ownership of polling the readChannel on a socket.

To respect the single-socket-for-streaming paradigm we currently have, I'm 
working on promoting IncomingMessageHandler and OutgoingMessageHandler into 
higher-level abstractions that are responsible for polling the socket, rather 
than being owned by a StreamSession as they are now.  On the inbound side they 
would dispatch to the various StreamSessions based on deserialized session 
indices; on the outbound side they would follow the current PriorityQueue 
polling mechanism.
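
The inbound half of that idea, one reader on the socket that routes each message to a session by index, could look roughly like the sketch below. This is an assumption-laden illustration, not the Cassandra handlers: `SessionDispatcher`, `register`, and `dispatch` are hypothetical names, a `Consumer<String>` stands in for a StreamSession, and a `String` stands in for a deserialized message.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: routes already-deserialized messages to sessions by index.
final class SessionDispatcher {
    private final Map<Integer, Consumer<String>> sessions = new HashMap<>();

    // Registers the receiver for messages tagged with the given session index.
    void register(int sessionIndex, Consumer<String> session) {
        sessions.put(sessionIndex, session);
    }

    // Delivers the payload to its session; returns false if the index is unknown.
    boolean dispatch(int sessionIndex, String payload) {
        Consumer<String> session = sessions.get(sessionIndex);
        if (session == null) {
            return false;
        }
        session.accept(payload);
        return true;
    }

    public static void main(String[] args) {
        SessionDispatcher dispatcher = new SessionDispatcher();
        dispatcher.register(0, msg -> System.out.println("session 0 got " + msg));
        dispatcher.register(1, msg -> System.out.println("session 1 got " + msg));
        // A single reader loop would call dispatch once per message read off the socket.
        dispatcher.dispatch(1, "file-chunk-A");
        dispatcher.dispatch(0, "file-chunk-B");
        System.out.println("unknown index handled: " + dispatcher.dispatch(7, "stray"));
    }
}
```

With this shape only the dispatcher touches the socket, so the sessions no longer compete over who polls the read channel.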

It doesn't look like we're at risk of a bottleneck on network resources even 
over a single socket: my prelim parallelized stream testing is peaking at 
~55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections, so there are 
diminishing returns as we go higher.  Compared to the 24MB/s I'm benchmarking 
on a single connection, it's still a respectable increase.



[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-03-04 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920122#comment-13920122
 ] 

Joshua McKenzie commented on CASSANDRA-3668:


With streaming being overhauled in 2.0 it looks like I'll need to revisit the 
core implementation on this.

Got the following from Yuki:
"In streaming 2.0, transferring/receiving files to one destination is managed 
by a StreamSession object.
To parallelize streaming, I think the easiest way is to make multiple 
StreamSessions per host, instead of the one per host that we use right now.
So given the set of SSTable files in the bulk loader, divide them into as many 
groups as there are parallel streams, and assign each group to one of the 
multiple StreamSessions for the same destination.

One thing to be careful about in this approach is that you need to add one more 
id to streaming messages in order to distinguish messages to the same 
destination.
This also changes the serialization format of the protocol, so you need to 
update the streaming protocol version defined in StreamMessage."
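
The grouping step Yuki describes can be sketched as follows. This is a minimal illustration under assumptions, not loader code: `SSTableGrouper` and `group` are hypothetical names, file names stand in for SSTables, and round-robin is just one simple way to split the files (it balances file counts, not bytes). The position of each group in the result plays the role of the extra id that would distinguish sessions to the same destination.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits SSTable file names into one group per parallel stream.
final class SSTableGrouper {
    static List<List<String>> group(List<String> sstables, int parallelStreams) {
        if (parallelStreams < 1) {
            throw new IllegalArgumentException("parallelStreams must be >= 1");
        }
        List<List<String>> groups = new ArrayList<>();
        for (int i = 0; i < parallelStreams; i++) {
            groups.add(new ArrayList<>());
        }
        // Round-robin assignment keeps group sizes within one file of each other.
        for (int i = 0; i < sstables.size(); i++) {
            groups.get(i % parallelStreams).add(sstables.get(i));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a-Data.db", "b-Data.db", "c-Data.db",
                "d-Data.db", "e-Data.db");
        List<List<String>> groups = group(files, 2);
        // The list position doubles as the per-destination session id.
        for (int sessionId = 0; sessionId < groups.size(); sessionId++) {
            System.out.println("session " + sessionId + " -> " + groups.get(sessionId));
        }
    }
}
```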



[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2014-02-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900880#comment-13900880
 ] 

Jonathan Ellis commented on CASSANDRA-3668:
---

[~yukim] can you outline for Josh what is needed here and let him take it from 
there?

> Parallel streaming for sstableloader
> 
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: API
>Reporter: Manish Zope
>Assignee: Joshua McKenzie
>Priority: Minor
>  Labels: streaming
> Fix For: 2.1
>
> Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 
> 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h


[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader

2013-04-10 Thread Viktor Jevdokimov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627604#comment-13627604
 ] 

Viktor Jevdokimov commented on CASSANDRA-3668:
--

Any progress on the issue?

> Parallel streaming for sstableloader
> 
>
> Key: CASSANDRA-3668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: API
>Reporter: Manish Zope
>Assignee: Yuki Morishita
>Priority: Minor
>  Labels: streaming
> Fix For: 2.0
>
> Attachments: 3668-1.1.txt, 3668-1.1-v2.txt, 
> 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h