[jira] [Updated] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage

2015-11-23 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-8670:
--
Component/s: Streaming and Messaging
 Local Write-Read Paths
 Compaction

> Large columns + NIO memory pooling causes excessive direct memory usage
> ---
>
> Key: CASSANDRA-8670
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8670
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction, Local Write-Read Paths, Streaming and 
> Messaging
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 2.2.0 beta 1
>
> Attachments: OutputStreamBench.java, largecolumn_test.py
>
>
> If you provide a large byte array to NIO and ask it to populate that array 
> from a socket, it will allocate a thread-local direct byte buffer the size 
> of the requested read, no matter how large the read is. Old IO wraps new IO 
> for sockets (but not files), so old IO is affected as well.
> Even if you are using Buffered{Input | Output}Stream you can end up passing 
> a large byte array to NIO: the byte array read method hands the array to 
> NIO directly if it is larger than the internal buffer.
> Passing large cells between nodes as part of intra-cluster messaging can 
> cause the NIO pooled buffers to quickly reach a high watermark and stay 
> there. This ends up costing 2x the largest cell size, because input and 
> output are handled by different threads and each gets its own buffer. This 
> is further multiplied by the number of nodes in the cluster minus one, 
> since each peer has a dedicated thread pair with separate thread locals.
> Anecdotally it appears that the cost is doubled beyond that, although it 
> isn't clear why. Possibly the control connections, or possibly there is 
> some way in which multiple 
> We need a workload in CI that tests the advertised limits of cells on a 
> cluster. It would be reasonable to ratchet down the max direct memory for 
> the test so that failures are triggered if a memory pooling issue is 
> introduced. I don't think we need to test concurrently pulling in a lot of 
> them, but it should at least work serially.
> The obvious fix is to read in smaller chunks when dealing with large 
> values. "Small" should still be relatively large (around 4 megabytes) so 
> that code reading from disk can amortize the cost of a seek. In some of the 
> contexts where we might implement the switch to chunked reads it can be 
> hard to tell what the underlying source being read from will be.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage

2015-04-01 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-8670:
--
Attachment: OutputStreamBench.java

I updated the microbenchmark to do what I think is the right thing WRT dead 
code elimination and constant folding. The results don't change, but it looks 
more like I know what I am doing.
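
For context on those two pitfalls, the usual JMH remedies are to take inputs 
from mutable @State (which defeats constant folding) and to return or 
Blackhole-consume every result (which defeats dead code elimination). A 
generic sketch of the pattern, not the attached OutputStreamBench.java:

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;

    @State(Scope.Thread)
    public class FoldingSafeBench
    {
        // Mutable state: the JIT cannot constant-fold a value it must re-read.
        byte[] payload;

        @Setup
        public void setup()
        {
            payload = new byte[8192];
        }

        @Benchmark
        public long checksum()
        {
            long sum = 0;
            for (byte b : payload)
                sum += b;
            // Returning the result hands it to JMH's implicit Blackhole,
            // so the loop cannot be eliminated as dead code.
            return sum;
        }
    }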



[jira] [Updated] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage

2015-03-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-8670:
--
Reviewer: Benedict



[jira] [Updated] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage

2015-03-23 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-8670:
--
Attachment: largecolumn_test.py

This took way longer than I would have liked, but here is an alternative 
implementation of DataInputStream and DataOutputStreamPlus that wraps a 
WritableByteChannel and does any necessary buffering itself.

[Implementation available on  
github.|https://github.com/apache/cassandra/compare/trunk...aweisberg:C-8670?expand=1]
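
The idea, roughly (an illustrative sketch, not the linked implementation; the 
class name and buffer size are made up here): keep one fixed-size buffer per 
stream and flush it to the channel whenever it fills, so NIO only ever sees 
bounded slices regardless of how large a write the caller hands us.

    import java.io.IOException;
    import java.io.OutputStream;
    import java.nio.ByteBuffer;
    import java.nio.channels.WritableByteChannel;

    public class ChannelBufferedOutputStream extends OutputStream
    {
        private final WritableByteChannel channel;
        private final ByteBuffer buffer = ByteBuffer.allocate(64 * 1024); // bounded, reused

        public ChannelBufferedOutputStream(WritableByteChannel channel)
        {
            this.channel = channel;
        }

        @Override
        public void write(int b) throws IOException
        {
            if (!buffer.hasRemaining())
                flush();
            buffer.put((byte) b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException
        {
            while (len > 0)
            {
                if (!buffer.hasRemaining())
                    flush();
                int n = Math.min(len, buffer.remaining());
                buffer.put(b, off, n); // copy at most one buffer's worth per pass
                off += n;
                len -= n;
            }
        }

        @Override
        public void flush() throws IOException
        {
            buffer.flip();
            while (buffer.hasRemaining())
                channel.write(buffer);
            buffer.clear();
        }
    }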

I am also attaching a dtest that validates that almost no direct byte buffer 
memory is allocated even when using large columns. To check how much is 
allocated, I used reflection on java.nio.Bits and had GCInspector report it 
along with the other metrics it supplies.
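
The reflection trick looks roughly like this (a sketch: the field layout of 
java.nio.Bits varies across JDK versions, e.g. totalCapacity is a plain long 
in some JDKs and an AtomicLong or renamed in others, so this handles both and 
newer JDKs may additionally require opening the java.nio package):

    import java.lang.reflect.Field;
    import java.util.concurrent.atomic.AtomicLong;

    public final class DirectMemoryProbe
    {
        // Total capacity of all live direct ByteBuffers, as tracked by java.nio.Bits.
        public static long directBufferCapacity() throws Exception
        {
            Field f = Class.forName("java.nio.Bits").getDeclaredField("totalCapacity");
            f.setAccessible(true);
            Object value = f.get(null);
            return value instanceof AtomicLong ? ((AtomicLong) value).get()
                                               : ((Number) value).longValue();
        }
    }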

To make this easy to test, I added a -D flag that makes Netty not pool memory 
and prefer non-direct byte buffers.
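
(For comparison, recent Netty 4 releases expose similar knobs of their own, 
e.g. -Dio.netty.allocator.type=unpooled and -Dio.netty.noPreferDirect=true; 
the flag added by this patch is Cassandra-specific and is not named in this 
comment.)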

The only other place where I think we might run into this issue is streaming, 
which operates on the input/output streams from sockets. With streaming you 
don't connect to as many nodes, and if the thread used for streaming is 
released once streaming completes, it shouldn't be a problem.




[jira] [Updated] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage

2015-03-12 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-8670:
--
Description: 
If you provide a large byte array to NIO and ask it to populate that array 
from a socket, it will allocate a thread-local direct byte buffer the size of 
the requested read, no matter how large the read is. Old IO wraps new IO for 
sockets (but not files), so old IO is affected as well.

Even if you are using Buffered{Input | Output}Stream you can end up passing a 
large byte array to NIO: the byte array read method hands the array to NIO 
directly if it is larger than the internal buffer.
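
To make that concrete, here is a minimal, illustrative demo (class name and 
sizes are made up; it assumes an OpenJDK/HotSpot runtime, where 
sun.nio.ch.Util caches a thread-local direct buffer sized to the request) 
showing how a single large read pins a correspondingly large direct buffer:

    import java.io.InputStream;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;

    public class LargeReadDemo
    {
        public static void main(String[] args) throws Exception
        {
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            SocketChannel client = SocketChannel.open(server.getLocalAddress());
            SocketChannel peer = server.accept();
            peer.write(ByteBuffer.wrap(new byte[1024])); // only 1 KB on the wire

            byte[] big = new byte[64 * 1024 * 1024]; // pretend this is a 64 MB cell
            InputStream in = client.socket().getInputStream();
            // Although only ~1 KB is available, handing NIO the 64 MB array
            // makes it allocate and thread-locally cache a ~64 MB direct buffer.
            int n = in.read(big);
            System.out.println("read " + n + " bytes");
        }
    }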

Passing large cells between nodes as part of intra-cluster messaging can cause 
the NIO pooled buffers to quickly reach a high watermark and stay there. This 
ends up costing 2x the largest cell size, because input and output are handled 
by different threads and each gets its own buffer. This is further multiplied 
by the number of nodes in the cluster minus one, since each peer has a 
dedicated thread pair with separate thread locals.
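
To put hypothetical numbers on that math: with a 64 MB largest cell on a 
10-node cluster, each node could end up pinning

    2 buffers per peer x 64 MB x (10 - 1) peers = 1152 MB

of direct memory, before the unexplained extra doubling described below.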

Anecdotally it appears that the cost is doubled beyond that, although it isn't 
clear why. Possibly the control connections, or possibly there is some way in 
which multiple 

We need a workload in CI that tests the advertised limits of cells on a 
cluster. It would be reasonable to ratchet down the max direct memory for the 
test so that failures are triggered if a memory pooling issue is introduced. I 
don't think we need to test concurrently pulling in a lot of them, but it 
should at least work serially.
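
For reference, the direct memory ceiling for such a test can be lowered with 
the standard JVM flag, e.g.

    -XX:MaxDirectMemorySize=256M

(the value here is illustrative); allocations past the limit then fail fast 
with an OutOfMemoryError ("Direct buffer memory") instead of silently 
ballooning the process footprint.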

The obvious fix is to read in smaller chunks when dealing with large values. 
"Small" should still be relatively large (around 4 megabytes) so that code 
reading from disk can amortize the cost of a seek. In some of the contexts 
where we might implement the switch to chunked reads it can be hard to tell 
what the underlying source being read from will be.
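
A minimal sketch of that chunked approach (illustrative only, not the 
committed patch; names and the 4 MB constant are assumptions): copy through a 
bounded buffer so the NIO layer never sees more than one chunk at a time.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    public final class ChunkedCopy
    {
        // Large enough to amortize a disk seek, small enough to bound the
        // thread-local direct buffer NIO allocates per read/write.
        private static final int CHUNK = 4 * 1024 * 1024;

        public static void copy(InputStream in, OutputStream out, long length) throws IOException
        {
            byte[] buffer = new byte[(int) Math.min(length, CHUNK)];
            long remaining = length;
            while (remaining > 0)
            {
                int n = in.read(buffer, 0, (int) Math.min(remaining, buffer.length));
                if (n < 0)
                    throw new IOException("unexpected end of stream with " + remaining + " bytes left");
                out.write(buffer, 0, n);
                remaining -= n;
            }
        }
    }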



[jira] [Updated] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage

2015-03-07 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-8670:
-
Fix Version/s: 3.0



[jira] [Updated] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage

2015-01-23 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-8670:
--
Description: 
If you provide a large byte array to NIO and ask it to populate that array 
from a socket, it will allocate a thread-local direct byte buffer the size of 
the requested read, no matter how large the read is. Old IO wraps new IO for 
sockets (but not files), so old IO is affected as well.

Even if you are using Buffered{Input | Output}Stream you can end up passing a 
large byte array to NIO: the byte array read method hands the array to NIO 
directly if it is larger than the internal buffer.

Passing large cells between nodes as part of intra-cluster messaging can cause 
the NIO pooled buffers to quickly reach a high watermark and stay there. This 
ends up costing 2x the largest cell size, because input and output are handled 
by different threads and each gets its own buffer. This is further multiplied 
by the number of nodes in the cluster minus one, since each peer has a 
dedicated thread pair with separate thread locals.

Anecdotally it appears that the cost is doubled beyond that, although it isn't 
clear why. Possibly the control connections, or possibly there is some way in 
which multiple 

We need a workload in CI that tests the advertised limits of cells on a 
cluster. It would be reasonable to ratchet down the max direct memory for the 
test so that failures are triggered if a memory pooling issue is introduced. I 
don't think we need to test concurrently pulling in a lot of them, but it 
should at least work serially.

The obvious fix is to read in smaller chunks when dealing with large values. 
"Small" should still be relatively large (around 4 megabytes) so that code 
reading from disk can amortize the cost of a seek. In some of the contexts 
where we might implement the switch to chunked reads it can be hard to tell 
what the underlying source being read from will be.


