[jira] [Commented] (CASSANDRA-11838) dtest failure in largecolumn_test:TestLargeColumn.cleanup_test

2016-06-23 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346499#comment-15346499
 ] 

Philip Thompson commented on CASSANDRA-11838:
-

So this test doesn't appear to have ever stopped failing. You want a new ticket?

> dtest failure in largecolumn_test:TestLargeColumn.cleanup_test
> --
>
> Key: CASSANDRA-11838
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11838
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Philip Thompson
>Assignee: Alex Petrov
>  Labels: dtest
> Fix For: 3.8
>
> Attachments: node1.log, node1_debug.log, node2.log, node2_debug.log
>
>
> Example failure at:
> http://cassci.datastax.com/job/trunk_offheap_dtest/200/testReport/largecolumn_test/TestLargeColumn/cleanup_test/
> node 1 contains the following OOM in its log:
> {code}
> ERROR [SharedPool-Worker-1] 2016-05-16 22:54:10,112 Message.java:611 - 
> Unexpected exception during request; channel = [id: 0xb97f2640, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:48190]
> java.lang.OutOfMemoryError: Java heap space
>   at org.apache.cassandra.transport.CBUtil.readRawBytes(CBUtil.java:533) 
> ~[main/:na]
>   at 
> org.apache.cassandra.transport.CBUtil.readBoundValue(CBUtil.java:407) 
> ~[main/:na]
>   at org.apache.cassandra.transport.CBUtil.readValueList(CBUtil.java:462) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryOptions$Codec.decode(QueryOptions.java:417) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryOptions$Codec.decode(QueryOptions.java:365) 
> ~[main/:na]
>   at 
> org.apache.cassandra.transport.messages.ExecuteMessage$1.decode(ExecuteMessage.java:45)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.messages.ExecuteMessage$1.decode(ExecuteMessage.java:41)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:280)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:261)
>  ~[main/:na]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:277)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:264)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:962)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:879)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:360) 
> ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:276) 
> ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> ERROR [SharedPool-Worker-1] 2016-05-16 22:54:10,756 Message.java:611 - 
> Unexpected exception during request; channel = [id: 0xba0be401, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:48191]
> 

[jira] [Commented] (CASSANDRA-11838) dtest failure in largecolumn_test:TestLargeColumn.cleanup_test

2016-06-01 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309434#comment-15309434
 ] 

Alex Petrov commented on CASSANDRA-11838:
-

I've been trying to get rid of the `DataOutputBuffer` instances, although there 
always was a reference hanging somewhere. [~tjake] pointed out that it may be 
happening when the instance is recycled once but not again. 

I've opened a [PR|https://github.com/netty/netty/pull/5333] (already merged 
thanks to [~norman]). In short: large / grown instances were not recycled by 
us, but were hanging in the array backing the {{Stack}}. 

I'd suggest let's trim buffers to {128} bytes (default initial value) when the 
instances are > 1mb until we upgrade to the version where we can throw away the 
recycled instances. And create a follow-up ticket to make sure we remove custom 
code that trims buffers and can throw the instances away. The second point 
(with {{fill}}) is addressed, too. I've triggered tests (including one with 
running the failing test many times):

|[trunk|https://github.com/ifesdjeen/cassandra/tree/11838-trunk] 
|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11838-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11838-trunk-dtest/]|[multiplexed|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11838-trunk-multiplexed/]|
 

> dtest failure in largecolumn_test:TestLargeColumn.cleanup_test
> --
>
> Key: CASSANDRA-11838
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11838
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Philip Thompson
>Assignee: Alex Petrov
>  Labels: dtest
> Fix For: 3.x
>
> Attachments: node1.log, node1_debug.log, node2.log, node2_debug.log
>
>
> Example failure at:
> http://cassci.datastax.com/job/trunk_offheap_dtest/200/testReport/largecolumn_test/TestLargeColumn/cleanup_test/
> node 1 contains the following OOM in its log:
> {code}
> ERROR [SharedPool-Worker-1] 2016-05-16 22:54:10,112 Message.java:611 - 
> Unexpected exception during request; channel = [id: 0xb97f2640, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:48190]
> java.lang.OutOfMemoryError: Java heap space
>   at org.apache.cassandra.transport.CBUtil.readRawBytes(CBUtil.java:533) 
> ~[main/:na]
>   at 
> org.apache.cassandra.transport.CBUtil.readBoundValue(CBUtil.java:407) 
> ~[main/:na]
>   at org.apache.cassandra.transport.CBUtil.readValueList(CBUtil.java:462) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryOptions$Codec.decode(QueryOptions.java:417) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryOptions$Codec.decode(QueryOptions.java:365) 
> ~[main/:na]
>   at 
> org.apache.cassandra.transport.messages.ExecuteMessage$1.decode(ExecuteMessage.java:45)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.messages.ExecuteMessage$1.decode(ExecuteMessage.java:41)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:280)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:261)
>  ~[main/:na]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:277)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:264)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> 

[jira] [Commented] (CASSANDRA-11838) dtest failure in largecolumn_test:TestLargeColumn.cleanup_test

2016-05-31 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307918#comment-15307918
 ] 

T Jake Luciani commented on CASSANDRA-11838:


Nice detective work.

Nits:  

  * We should probably drop the buffer on the floor vs re-allocate then 
recycle. My reasoning is if there are rows >1Mb there is probably some strange 
workload where it's going to need to re-allocate to > 1Mb anyway. So rather 
than allocating twice just don't recycle the ones >1Mb.

  * Prefer you use Arrays.fill vs for loop for a simple clear

> dtest failure in largecolumn_test:TestLargeColumn.cleanup_test
> --
>
> Key: CASSANDRA-11838
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11838
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Philip Thompson
>Assignee: Alex Petrov
>  Labels: dtest
> Fix For: 3.x
>
> Attachments: node1.log, node1_debug.log, node2.log, node2_debug.log
>
>
> Example failure at:
> http://cassci.datastax.com/job/trunk_offheap_dtest/200/testReport/largecolumn_test/TestLargeColumn/cleanup_test/
> node 1 contains the following OOM in its log:
> {code}
> ERROR [SharedPool-Worker-1] 2016-05-16 22:54:10,112 Message.java:611 - 
> Unexpected exception during request; channel = [id: 0xb97f2640, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:48190]
> java.lang.OutOfMemoryError: Java heap space
>   at org.apache.cassandra.transport.CBUtil.readRawBytes(CBUtil.java:533) 
> ~[main/:na]
>   at 
> org.apache.cassandra.transport.CBUtil.readBoundValue(CBUtil.java:407) 
> ~[main/:na]
>   at org.apache.cassandra.transport.CBUtil.readValueList(CBUtil.java:462) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryOptions$Codec.decode(QueryOptions.java:417) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryOptions$Codec.decode(QueryOptions.java:365) 
> ~[main/:na]
>   at 
> org.apache.cassandra.transport.messages.ExecuteMessage$1.decode(ExecuteMessage.java:45)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.messages.ExecuteMessage$1.decode(ExecuteMessage.java:41)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:280)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:261)
>  ~[main/:na]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:277)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:264)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:962)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:879)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:360) 
> ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:276) 
> ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>  

[jira] [Commented] (CASSANDRA-11838) dtest failure in largecolumn_test:TestLargeColumn.cleanup_test

2016-05-29 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306065#comment-15306065
 ] 

Alex Petrov commented on CASSANDRA-11838:
-

I've ran a git bisect and it confirms the commit 
[1e92ce|https://github.com/apache/cassandra/commit/1e92ce43a5a730f81d3f6cfd72e7f4b126db788a]
 same mentioned [here|http://cassci.datastax.com/job/trunk_offheap_dtest/185/]. 
The test is failing also without {{offheap_memtables}}.

The patch is cleaning up the references to values within the reused {{BTree}} 
and "trims" the large values within {{DataOutputBuffer}}. Since the buffer is 
growing and there's a setup under which the value may grow large enough, even 
if it's just read once, it'll remain there, so we need to clean it up. 
Unfortunately, we can't just skip recycling for such objects, as there's no way 
to pass constructor options (which may be a good thing as {{Recycler}} assumes 
that all instances are the same, so it can pick whichever is available), so we 
have to force-trim (in this case, re-allocate the buffer to some threshold 
value).

> dtest failure in largecolumn_test:TestLargeColumn.cleanup_test
> --
>
> Key: CASSANDRA-11838
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11838
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Philip Thompson
>Assignee: Alex Petrov
>  Labels: dtest
> Fix For: 3.x
>
> Attachments: node1.log, node1_debug.log, node2.log, node2_debug.log
>
>
> Example failure at:
> http://cassci.datastax.com/job/trunk_offheap_dtest/200/testReport/largecolumn_test/TestLargeColumn/cleanup_test/
> node 1 contains the following OOM in its log:
> {code}
> ERROR [SharedPool-Worker-1] 2016-05-16 22:54:10,112 Message.java:611 - 
> Unexpected exception during request; channel = [id: 0xb97f2640, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:48190]
> java.lang.OutOfMemoryError: Java heap space
>   at org.apache.cassandra.transport.CBUtil.readRawBytes(CBUtil.java:533) 
> ~[main/:na]
>   at 
> org.apache.cassandra.transport.CBUtil.readBoundValue(CBUtil.java:407) 
> ~[main/:na]
>   at org.apache.cassandra.transport.CBUtil.readValueList(CBUtil.java:462) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryOptions$Codec.decode(QueryOptions.java:417) 
> ~[main/:na]
>   at 
> org.apache.cassandra.cql3.QueryOptions$Codec.decode(QueryOptions.java:365) 
> ~[main/:na]
>   at 
> org.apache.cassandra.transport.messages.ExecuteMessage$1.decode(ExecuteMessage.java:45)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.messages.ExecuteMessage$1.decode(ExecuteMessage.java:41)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:280)
>  ~[main/:na]
>   at 
> org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:261)
>  ~[main/:na]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:277)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:264)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:962)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:879)
>  ~[netty-all-4.0.36.Final.jar:4.0.36.Final]
>   at 
>