[jira] [Updated] (CASSANDRA-18559) Upgrade to 4.1.1 fails with NullPointerException
[ https://issues.apache.org/jira/browse/CASSANDRA-18559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-18559:
---
Since Version: 4.1.1

> Upgrade to 4.1.1 fails with NullPointerException
>
> Key: CASSANDRA-18559
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18559
> Project: Cassandra
> Issue Type: Bug
> Reporter: Eric Evans
> Priority: Normal
>
> When upgrading from 3.11.14 to 4.1.1 —and when {{internode_encryption}} is
> one of {{dc}} or {{rack}}— startup fails with an NPE.
>
> {noformat}
> io.netty.handler.codec.DecoderException: java.lang.NullPointerException
>     at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:478)
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
>     at io.netty.handler.codec.ByteToMessageDecoder.handlerRemoved(ByteToMessageDecoder.java:253)
>     at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:515)
>     at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447)
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
>     at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
>     at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)
>     at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>     at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.lang.Thread.run(Thread.java:750)
> Caused by: java.lang.NullPointerException: null
>     at org.apache.cassandra.locator.GossipingPropertyFileSnitch.getRack(GossipingPropertyFileSnitch.java:116)
>     at org.apache.cassandra.locator.DynamicEndpointSnitch.getRack(DynamicEndpointSnitch.java:162)
>     at org.apache.cassandra.config.EncryptionOptions$ServerEncryptionOptions.shouldEncrypt(EncryptionOptions.java:682)
>     at org.apache.cassandra.net.InboundConnectionInitiator$Handler.isEncryptionRequired(InboundConnectionInitiator.java:363)
>     at org.apache.cassandra.net.InboundConnectionInitiator$Handler.initiate(InboundConnectionInitiator.java:278)
>     at org.apache.cassandra.net.InboundConnectionInitiator$Handler.decode(InboundConnectionInitiator.java:265)
>     at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508)
>     at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447)
>     ... 22 common frames omitted
> {noformat}
[jira] [Updated] (CASSANDRA-18559) Upgrade to 4.1.1 fails with NullPointerException
[ https://issues.apache.org/jira/browse/CASSANDRA-18559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-18559:
---
Description: When upgrading from 3.11.14 to 4.1.1 —and when {{internode_encryption}} is one of {{dc}} or {{rack}}— startup fails with an NPE.
[jira] [Created] (CASSANDRA-18559) Upgrade to 4.1.1 fails with NullPointerException
Eric Evans created CASSANDRA-18559:
--
Summary: Upgrade to 4.1.1 fails with NullPointerException
Key: CASSANDRA-18559
URL: https://issues.apache.org/jira/browse/CASSANDRA-18559
Project: Cassandra
Issue Type: Bug
Reporter: Eric Evans

When upgrading from 3.11.14 to 4.1.1 —and when {{internode_encryption}} is one of {{dc}} or {{rack}}— startup fails with an NPE.
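The failing frames suggest the inbound handshake asks the snitch for the remote peer's rack before gossip has learned about that peer. A minimal standalone sketch of how such a rack-level encryption check blows up for a not-yet-known endpoint (this is NOT Cassandra's code; the class, map, and endpoint strings are hypothetical stand-ins):

```java
import java.util.HashMap;
import java.util.Map;

public class RackCheckSketch {
    // hypothetical endpoint -> rack map standing in for gossip/snitch state
    static final Map<String, String> racks = new HashMap<>();
    static {
        racks.put("10.0.0.1", "rack1");
        racks.put("10.0.0.2", "rack2");
    }

    // returns null for a peer whose topology hasn't been learned yet
    static String getRack(String endpoint) {
        return racks.get(endpoint);
    }

    // "rack"-level internode_encryption: encrypt only across racks.
    // The unguarded dereference mirrors the shape of the failure in the trace.
    static boolean shouldEncryptRackLevel(String localEp, String remoteEp) {
        return !getRack(remoteEp).equals(getRack(localEp));
    }

    public static void main(String[] args) {
        // both peers known: works as intended
        System.out.println(shouldEncryptRackLevel("10.0.0.1", "10.0.0.2"));
        try {
            shouldEncryptRackLevel("10.0.0.1", "10.0.0.99"); // peer not yet known
        } catch (NullPointerException e) {
            System.out.println("NPE for a not-yet-known peer");
        }
    }
}
```

A null check (or a defined fallback rack/DC) before the comparison would avoid the crash; whether that is the right fix for the ticket is a separate question.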
[jira] [Commented] (CASSANDRA-13984) dead Small OutboundTcpConnection to restarted nodes blocking hint delivery
[ https://issues.apache.org/jira/browse/CASSANDRA-13984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17716054#comment-17716054 ] Eric Evans commented on CASSANDRA-13984:

I can confirm this issue; I am seeing what is described here on a 3.11.13 cluster, and I believe it is rather more serious than a 20 minute delay in hints delivery. We have a number of clusters (all 3.11.13), but this manifests only on one of them (unfortunately, our most critical use-case). The cluster in question has 3 nodes in each of 2 data-centers, with a replication factor of 3 per data-center (6 total). We do LOCAL_QUORUM reads & writes, and EACH_QUORUM deletes. When this happens, the affected node is for all intents and purposes partitioned from the rest, and clients making queries to it are unable to obtain quorum. The only remedy (short of waiting 20 minutes) is to restart the other nodes in the cluster, which exacerbates the issue. I can make our logs available, but they look like those posted in the description. The other nodes of the cluster see the shutdown, terminate the outbound messaging connection, and then immediately attempt to reconnect. If you are doing a simple restart, then the affected node seems to come back up in time for this connection to succeed; if not (if for example you are rebooting the host), then these outbound connections must (apparently) time out and reconnect (20 minutes later).

> dead Small OutboundTcpConnection to restarted nodes blocking hint delivery
>
> Key: CASSANDRA-13984
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13984
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Startup and Shutdown
> Environment: - We run an entirely counter-based workload on the clusters
> we've been seeing this issue on (although it might happen for non-counter
> clusters).
> - We originally had a 48 node cassandra 3.7 cluster (averaging ~500GB of load
> on each node) running on c4.4xlarge instances in EC2.
> - We've since split this cluster into two cassandra 3.11.0 clusters (for
> other performance reasons), one with 1/3 of our dataset on 48 c4.4xlarge
> instances, and the other with 2/3 of our data set on a 96 node c4.4xlarge
> cluster.
> - All of these clusters run on ubuntu 14.04 with enhanced networking enabled.
> Reporter: John Meichle
> Priority: Normal
> Attachments: 3.11_node_restart_with_disablegossip.svg,
> 3.11_node_restart_without_disablegossip.svg
>
> Hello. For the last two months we've been fighting performance issues
> relating to node restarts and hint playback, and were able to get a pretty
> good bit of proof for the issue last night when debugging one of these
> restarts.
> The main issue we've been fighting with these clusters is very slow and
> unstable node restarts, which seem to be due to hint playback, with logs
> indicating "Finished hinted handoff of file ... hints to endpoint: 1-2-3-4,
> partially". We've posted about this on the mailing list as well, and this bug
> seems to be the cause of this issue:
> https://lists.apache.org/thread.html/41e9cb7b20696dd177fe925aba30d372614fcfe11d98275c7d9579cc@%3Cuser.cassandra.apache.org%3E
> Our process for restarting nodes is to run nodetool drain and then restart
> the service via init. When originally watching the logs we saw on the
> restarting node the standard cassandra startup process of initializing
> keyspaces, loading sstables, and finally starting to handshake with the
> cluster, which takes about 5 minutes. After this, the logs are very quiet
> until 15 minutes later.
> During this 15 minute period, some peers (between 0 and half of the cluster)
> are reported as DN status by nodetool status.
> When checking one of the nodes that is reported as DN, we see hint playback
> logging lines such as:
> {code}
> INFO [HintsDispatcher:12] 2017-10-31 01:06:18,876 HintsDispatchExecutor.java:289 - Finished hinted handoff of file 8724f664-dff1-4c20-887b-6a26ae54a9b5-1509410768866-1.hints to endpoint /20.0.131.175: 8724f664-dff1-4c20-887b-6a26ae54a9b5, partially
> {code}
> We traced the codepath from this log line, starting in HintsDispatcher. It
> appears the callback that is created for the hint sending is timing out, as we
> verified via the JMX metrics for the HintsService:
> {code}
> org.apache.cassandra.metrics:name=HintsFailed,type=HintsService
> org.apache.cassandra.metrics:name=HintsSucceeded,type=HintsService
> org.apache.cassandra.metrics:name=HintsTimedOut,type=HintsService
> {code}
> in which HintsTimedOut was incrementing during this period. We suspected this
> might be due to the restarting node being overloaded, as we do see heavy IO
> on the restart. However that IO pattern is not for the complete durati
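The HintsService metric names quoted in the report are ordinary JMX ObjectNames and can be read from any JMX client. A small JDK-only sketch that just constructs them (connecting to a live node, e.g. via Cassandra's default JMX port 7199, is omitted so this runs standalone; with an MBeanServerConnection one would poll their Count attribute during the restart window):

```java
import javax.management.ObjectName;

public class HintMetricNames {
    // builds the JMX name for one of the HintsService counters quoted above
    static String metricName(String counter) {
        try {
            return new ObjectName(
                    "org.apache.cassandra.metrics:name=" + counter + ",type=HintsService")
                .getCanonicalName();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        for (String c : new String[] { "HintsFailed", "HintsSucceeded", "HintsTimedOut" }) {
            System.out.println(metricName(c));
        }
    }
}
```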
[jira] [Commented] (CASSANDRA-17773) Incorrect cassandra.logdir on Debian systems
[ https://issues.apache.org/jira/browse/CASSANDRA-17773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572099#comment-17572099 ] Eric Evans commented on CASSANDRA-17773:

Prompting during the package install should always be avoided if possible; it's really common for packages like this to be installed unattended, or without a controlling tty. I'm not sure what the sanest/easiest way is to ensure upgrades between the affected versions here, but longer term it would be nice if we could harmonize the assignment of these environment variables. Ideally there'd be exactly one place where we checked for an unset env var and assigned a reasonable default, and cassandra-env.sh (or similar) could be made more like jvm.options.

> Incorrect cassandra.logdir on Debian systems
>
> Key: CASSANDRA-17773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17773
> Project: Cassandra
> Issue Type: Bug
> Components: Packaging
> Reporter: Eric Evans
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1-rc
>
> The Debian packaging patches bin/cassandra to use /var/log/cassandra for
> logs; it does so conditionally, however, only if CASSANDRA_LOG_DIR is unset.
> This occurs _after_ cassandra-env.sh is sourced though, which also sets
> CASSANDRA_LOG_DIR if unset (to $CASSANDRA_HOME/logs). The result is that
> -Dcassandra.logdir is set to /usr/share/cassandra/logs on Debian systems.

-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-17773) Incorrect cassandra.logdir on Debian systems
[ https://issues.apache.org/jira/browse/CASSANDRA-17773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571680#comment-17571680 ] Eric Evans commented on CASSANDRA-17773:

{quote}cassandra-env.sh is sourced first, and if CASSANDRA_LOG_DIR is unset then sets it to /var/log/cassandra (not to $CASSANDRA_HOME/logs as the description states). See https://github.com/apache/cassandra/blob/trunk/debian/patches/cassandra_logdir_fix.diff#L19 Are we sure that this isn't a case of an old cassandra-env.sh file being kept in a newer installation…?{quote}

Good catch [~mck], that is what happened here (I have a modified/unpatched {{cassandra-env.sh}}). I think this still counts as buggy though: {{cassandra-env.sh}} is marked as a conffile, and as such it won't be overwritten (by the patched version) at install time.

> Incorrect cassandra.logdir on Debian systems
>
> Key: CASSANDRA-17773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17773
> Project: Cassandra
> Issue Type: Bug
> Components: Packaging
> Reporter: Eric Evans
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1-rc
>
> The Debian packaging patches bin/cassandra to use /var/log/cassandra for
> logs; it does so conditionally, however, only if CASSANDRA_LOG_DIR is unset.
> This occurs _after_ cassandra-env.sh is sourced though, which also sets
> CASSANDRA_LOG_DIR if unset (to $CASSANDRA_HOME/logs). The result is that
> -Dcassandra.logdir is set to /usr/share/cassandra/logs on Debian systems.
[jira] [Created] (CASSANDRA-17773) Incorrect cassandra.logdir on Debian systems
Eric Evans created CASSANDRA-17773:
--
Summary: Incorrect cassandra.logdir on Debian systems
Key: CASSANDRA-17773
URL: https://issues.apache.org/jira/browse/CASSANDRA-17773
Project: Cassandra
Issue Type: Bug
Components: Packaging
Reporter: Eric Evans

The Debian packaging patches bin/cassandra to use /var/log/cassandra for logs; it does so conditionally, however, only if CASSANDRA_LOG_DIR is unset. This occurs _after_ cassandra-env.sh is sourced though, which also sets CASSANDRA_LOG_DIR if unset (to $CASSANDRA_HOME/logs). The result is that -Dcassandra.logdir is set to /usr/share/cassandra/logs on Debian systems.
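The ordering problem described above can be reproduced in a few lines of shell. This is a simplified sketch, not the actual packaging scripts: the two `if` blocks stand in for a stale (unpatched) cassandra-env.sh conffile and the Debian-patched bin/cassandra, in the order they run:

```shell
#!/bin/sh
CASSANDRA_HOME=/usr/share/cassandra

# 1. cassandra-env.sh (stale conffile) is sourced first and claims the default:
if [ -z "$CASSANDRA_LOG_DIR" ]; then CASSANDRA_LOG_DIR="$CASSANDRA_HOME/logs"; fi

# 2. the Debian patch in bin/cassandra runs afterwards, but the variable is
#    no longer unset, so its default never applies:
if [ -z "$CASSANDRA_LOG_DIR" ]; then CASSANDRA_LOG_DIR=/var/log/cassandra; fi

echo "-Dcassandra.logdir=$CASSANDRA_LOG_DIR"
```

Run with CASSANDRA_LOG_DIR unset, this prints `/usr/share/cassandra/logs` rather than the intended `/var/log/cassandra`, matching the behavior in the report.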
[jira] [Created] (CASSANDRA-17689) Documentation uses IP:port example for seeds list
Eric Evans created CASSANDRA-17689:
--
Summary: Documentation uses IP:port example for seeds list
Key: CASSANDRA-17689
URL: https://issues.apache.org/jira/browse/CASSANDRA-17689
Project: Cassandra
Issue Type: Bug
Components: Documentation
Reporter: Eric Evans

The documentation erroneously provides a verbatim snippet of the configuration file that suggests you can specify seeds in the form IP:port. See: https://cassandra.apache.org/doc/trunk/cassandra/configuration/cass_yaml_file.html#seed_provider
[jira] [Commented] (CASSANDRA-14797) CQLSSTableWriter does not support DELETE
[ https://issues.apache.org/jira/browse/CASSANDRA-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634664#comment-16634664 ] Eric Evans commented on CASSANDRA-14797:

Convention for posting patches/branches/tests notwithstanding...

||3.11||trunk||
|[branch|https://github.com/eevans/cassandra/tree/14797-3.11]|[branch|https://github.com/eevans/cassandra/tree/14797-trunk]|
|[utest|https://circleci.com/gh/eevans/cassandra/3]|[utest|https://circleci.com/gh/eevans/cassandra/6]|

> CQLSSTableWriter does not support DELETE
>
> Key: CASSANDRA-14797
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14797
> Project: Cassandra
> Issue Type: Bug
> Components: Libraries
> Reporter: Eric Evans
> Priority: Minor
> Fix For: 3.11.4, 4.x
>
> {{CQLSSTableWriter}} doesn't work with {{DELETE}} statements, and ought to.
[jira] [Commented] (CASSANDRA-14797) CQLSSTableWriter does not support DELETE
[ https://issues.apache.org/jira/browse/CASSANDRA-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634548#comment-16634548 ] Eric Evans commented on CASSANDRA-14797:

Patch forthcoming (as soon as I suss out the state-of-the-art for this).

> CQLSSTableWriter does not support DELETE
>
> Key: CASSANDRA-14797
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14797
> Project: Cassandra
> Issue Type: Bug
> Components: Libraries
> Reporter: Eric Evans
> Priority: Minor
> Fix For: 3.11.4, 4.x
>
> {{CQLSSTableWriter}} doesn't work with {{DELETE}} statements, and ought to.
[jira] [Created] (CASSANDRA-14797) CQLSSTableWriter does not support DELETE
Eric Evans created CASSANDRA-14797:
--
Summary: CQLSSTableWriter does not support DELETE
Key: CASSANDRA-14797
URL: https://issues.apache.org/jira/browse/CASSANDRA-14797
Project: Cassandra
Issue Type: Bug
Components: Libraries
Reporter: Eric Evans
Fix For: 3.11.4, 4.x

{{CQLSSTableWriter}} doesn't work with {{DELETE}} statements, and ought to.
[jira] [Commented] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16630407#comment-16630407 ] Eric Evans commented on CASSANDRA-14793:

Awesome; I'd been meaning to open this very ticket. I had planned to suggest what [~krummas] did, that it be possible to put {{system}} in a different data directory. At least if this were possible, {{system}} could be put on a RAID. And, at least in our environments, if the expectation is that you can survive a single device failure, then the OS is likely already on RAID-1 or similar. Of course, if the tables in {{system}} could be regenerated, that would be better still, but I'm not sure what that looks like complexity-wise versus pinning it.

> Improve system table handling when losing a disk when using JBOD
>
> Key: CASSANDRA-14793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14793
> Project: Cassandra
> Issue Type: Bug
> Reporter: Marcus Eriksson
> Priority: Major
> Fix For: 4.0
>
> We should improve the way we handle disk failures when losing a disk in a
> JBOD setup. One way could be to pin the system tables to a special data
> directory.
[jira] [Commented] (CASSANDRA-14355) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426397#comment-16426397 ] Eric Evans commented on CASSANDRA-14355:

Our production cluster is on 3.11.0 and is running fine, yes. This staging environment was first set up on 3.11.0, where these OOMs started, and has since been upgraded to 3.11.2. Both versions OOM (regularly, at about 1.5 hour intervals, in fact).

> Memory leak
> ---
>
> Key: CASSANDRA-14355
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14355
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Debian Jessie, OpenJDK 1.8.0_151
> Reporter: Eric Evans
> Priority: Major
> Fix For: 3.11.3
>
> Attachments: 01_Screenshot from 2018-04-04 14-24-00.png,
> 02_Screenshot from 2018-04-04 14-28-33.png,
> 03_Screenshot from 2018-04-04 14-24-50.png
>
> We're seeing regular, frequent {{OutOfMemoryError}} exceptions. Similar to
> CASSANDRA-13754, an analysis of the heap dumps shows the heap consumed by the
> {{threadLocals}} member of the instances of
> {{io.netty.util.concurrent.FastThreadLocalThread}}.
[jira] [Commented] (CASSANDRA-14355) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426179#comment-16426179 ] Eric Evans commented on CASSANDRA-14355:

Also, I can make the heap dumps available, but be warned: they are 12GB in size.

> Memory leak
> ---
>
> Key: CASSANDRA-14355
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14355
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Debian Jessie, OpenJDK 1.8.0_151
> Reporter: Eric Evans
> Priority: Major
> Fix For: 3.11.3
>
> Attachments: 01_Screenshot from 2018-04-04 14-24-00.png,
> 02_Screenshot from 2018-04-04 14-28-33.png,
> 03_Screenshot from 2018-04-04 14-24-50.png
>
> We're seeing regular, frequent {{OutOfMemoryError}} exceptions. Similar to
> CASSANDRA-13754, an analysis of the heap dumps shows the heap consumed by the
> {{threadLocals}} member of the instances of
> {{io.netty.util.concurrent.FastThreadLocalThread}}.
[jira] [Commented] (CASSANDRA-14355) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426178#comment-16426178 ] Eric Evans commented on CASSANDRA-14355:

{quote}[~urandom] Did this just start happening again or is this a continuation of the issues you saw with CASSANDRA-13754?{quote}

This is happening in a newly minted staging cluster, and is (readily) reproducible on both the same 3.11.0 build we use in production, and 3.11.2. I am (so far) at a loss to understand what is different about this environment.

> Memory leak
> ---
>
> Key: CASSANDRA-14355
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14355
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Debian Jessie, OpenJDK 1.8.0_151
> Reporter: Eric Evans
> Priority: Major
> Fix For: 3.11.3
>
> Attachments: 01_Screenshot from 2018-04-04 14-24-00.png,
> 02_Screenshot from 2018-04-04 14-28-33.png,
> 03_Screenshot from 2018-04-04 14-24-50.png
>
> We're seeing regular, frequent {{OutOfMemoryError}} exceptions. Similar to
> CASSANDRA-13754, an analysis of the heap dumps shows the heap consumed by the
> {{threadLocals}} member of the instances of
> {{io.netty.util.concurrent.FastThreadLocalThread}}.
[jira] [Updated] (CASSANDRA-14355) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-14355:
---
Attachment: 03_Screenshot from 2018-04-04 14-24-50.png
            02_Screenshot from 2018-04-04 14-28-33.png
            01_Screenshot from 2018-04-04 14-24-00.png

> Memory leak
> ---
>
> Key: CASSANDRA-14355
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14355
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Debian Jessie, OpenJDK 1.8.0_151
> Reporter: Eric Evans
> Priority: Major
> Fix For: 3.11.3
>
> Attachments: 01_Screenshot from 2018-04-04 14-24-00.png,
> 02_Screenshot from 2018-04-04 14-28-33.png,
> 03_Screenshot from 2018-04-04 14-24-50.png
>
> We're seeing regular, frequent {{OutOfMemoryError}} exceptions. Similar to
> CASSANDRA-13754, an analysis of the heap dumps shows the heap consumed by the
> {{threadLocals}} member of the instances of
> {{io.netty.util.concurrent.FastThreadLocalThread}}.
[jira] [Created] (CASSANDRA-14355) Memory leak
Eric Evans created CASSANDRA-14355: -- Summary: Memory leak Key: CASSANDRA-14355 URL: https://issues.apache.org/jira/browse/CASSANDRA-14355 Project: Cassandra Issue Type: Bug Components: Core Environment: Debian Jessie, OpenJDK 1.8.0_151 Reporter: Eric Evans Fix For: 3.11.3 We're seeing regular, frequent {{OutOfMemoryError}} exceptions. Similar to CASSANDRA-13754, an analysis of the heap dumps shows the heap consumed by the {{threadLocals}} member of the instances of {{io.netty.util.concurrent.FastThreadLocalThread}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
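For readers unfamiliar with the failure mode described above: a thread-local value stashed by a long-lived worker thread stays strongly reachable across tasks until the thread dies or the value is explicitly removed. The sketch below demonstrates this retention pattern with plain {{java.lang.ThreadLocal}} — it is illustrative only, not Cassandra or Netty code, and all names are hypothetical (Netty's {{FastThreadLocalThread}} uses its own indexed-array variant of the same idea).

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalRetention {
    // One 1 MiB buffer per thread; on a pooled thread it lives as long as
    // the thread itself, not as long as any single task.
    static final ThreadLocal<byte[]> BUFFER =
            ThreadLocal.withInitial(() -> new byte[1024 * 1024]);

    // Runs `tasks` tasks on a single pooled thread and reports how many
    // distinct buffer instances were observed; 1 means the value is cached
    // per thread and retained across tasks (never BUFFER.remove()'d).
    static int distinctBuffers(int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Set<byte[]> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> seen.add(BUFFER.get())).get();
        }
        pool.shutdown();
        return seen.size();
    }
}
```

With many pooled threads each pinning such values, heap dump analysis shows exactly the signature reported here: most of the heap reachable only through the threads' {{threadLocals}} members.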
[jira] [Commented] (CASSANDRA-14097) Per-node stream concurrency
[ https://issues.apache.org/jira/browse/CASSANDRA-14097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280402#comment-16280402 ] Eric Evans commented on CASSANDRA-14097: bq. CASSANDRA-12229 started down the road of implementing concurrency in streaming. What specific things are you thinking about? Thanks [~jasobrown], I wasn't aware of CASSANDRA-12229 (or the issues it references)! Mainly, I'm thinking of avoiding the scenario where available throughput is a function of how many nodes you're streaming from. If, for example, you have 3 nodes in 3 racks (1 node per rack), the bootstrap of an additional node will stream everything from just one other node (whichever node it shares a rack with). Throughput can be very low as a result (particularly if compression is in use); in our environment, I seldom see more than 36 Mbps per stream. CASSANDRA-4663 would solve this for me (because I have many keyspaces), but changing this from a function of "how many nodes" to "how many nodes and keyspaces" still seems less than ideal. > Per-node stream concurrency > --- > > Key: CASSANDRA-14097 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14097 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Eric Evans > > Stream sessions with a remote are bound to a single thread, and when > compression is in use this thread can be CPU bound, limiting throughput > considerably. When the number of nodes is small (i.e. when the number of > concurrent sessions is also low), rebuilds or bootstrap operations can take a > very long time. > Ideally, data could be streamed from any given remote concurrently. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
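To put the per-stream figure from the comment above in perspective, a back-of-the-envelope helper (hypothetical, not Cassandra code) shows why a single-source bootstrap hurts:

```java
public class StreamTime {
    // Rough wall-clock days to stream `terabytes` of data from a single
    // peer at `mbps` megabits per second (decimal units throughout).
    static double days(double terabytes, double mbps) {
        double bits = terabytes * 8e12;        // 1 TB = 8 * 10^12 bits
        double seconds = bits / (mbps * 1e6);  // time at the per-stream rate
        return seconds / 86400.0;
    }
}
```

At ~36 Mbps, pulling 1 TB from a single rack-local peer takes roughly two and a half days; streaming the same data concurrently from several peers divides that accordingly.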
[jira] [Updated] (CASSANDRA-14097) Per-node stream concurrency
[ https://issues.apache.org/jira/browse/CASSANDRA-14097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-14097: --- Component/s: Streaming and Messaging > Per-node stream concurrency > --- > > Key: CASSANDRA-14097 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14097 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Eric Evans > > Stream sessions with a remote are bound to a single thread, and when > compression is in use this thread can be CPU bound, limiting throughput > considerably. When the number of nodes is small (i.e. when the number of > concurrent sessions is also low), rebuilds or bootstrap operations can take a > very long time. > Ideally, data could be streamed from any given remote concurrently. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-11303) New inbound throughput parameters for streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-11303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-11303: --- Issue Type: New Feature (was: Improvement) > New inbound throughput parameters for streaming > --- > > Key: CASSANDRA-11303 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11303 > Project: Cassandra > Issue Type: New Feature > Components: Configuration >Reporter: Satoshi Konno >Assignee: Satoshi Konno >Priority: Minor > Attachments: 11303_inbound_limit_debug_20160419.log, > 11303_inbound_nolimit_debug_20160419.log, > 11303_inbound_patch_for_trunk_20160419.diff, > 11303_inbound_patch_for_trunk_20160525.diff, > 11303_inbound_patch_for_trunk_20160704.diff, > 200vs40inboundstreamthroughput.png, cassandra_inbound_stream.diff > > > Hi, > To specify the stream throughput of a node more clearly, I would like to add > the following new inbound parameters, like the existing outbound parameters, > in cassandra.yaml: > - stream_throughput_inbound_megabits_per_sec > - inter_dc_stream_throughput_inbound_megabits_per_sec > We use only the existing outbound parameters now, but it is difficult to > control the total throughput of a node with outbound limits alone. In our > production network, some critical alerts occur when a node exceeds the > specified total throughput, which is the sum of the input and output > throughputs. > In our operation of Cassandra, the alerts occur during bootstrap or repair > processing when a new node is added. In the worst case, we have to stop the > offending node. > I have attached the patch under consideration. I would like to add a new > limiter class, StreamInboundRateLimiter, and use it in the StreamDeserializer > class. I use Row::dataSize() to get the input throughput in > StreamDeserializer::newPartition(), but I am not sure whether dataSize() > returns the correct data size. > Can someone please tell me how to do it?
[jira] [Updated] (CASSANDRA-11303) New inbound throughput parameters for streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-11303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-11303: --- Issue Type: Improvement (was: New Feature) > New inbound throughput parameters for streaming > --- > > Key: CASSANDRA-11303 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11303 > Project: Cassandra > Issue Type: Improvement > Components: Configuration >Reporter: Satoshi Konno >Assignee: Satoshi Konno >Priority: Minor > Attachments: 11303_inbound_limit_debug_20160419.log, > 11303_inbound_nolimit_debug_20160419.log, > 11303_inbound_patch_for_trunk_20160419.diff, > 11303_inbound_patch_for_trunk_20160525.diff, > 11303_inbound_patch_for_trunk_20160704.diff, > 200vs40inboundstreamthroughput.png, cassandra_inbound_stream.diff > > > Hi, > To specify the stream throughput of a node more clearly, I would like to add > the following new inbound parameters, like the existing outbound parameters, > in cassandra.yaml: > - stream_throughput_inbound_megabits_per_sec > - inter_dc_stream_throughput_inbound_megabits_per_sec > We use only the existing outbound parameters now, but it is difficult to > control the total throughput of a node with outbound limits alone. In our > production network, some critical alerts occur when a node exceeds the > specified total throughput, which is the sum of the input and output > throughputs. > In our operation of Cassandra, the alerts occur during bootstrap or repair > processing when a new node is added. In the worst case, we have to stop the > offending node. > I have attached the patch under consideration. I would like to add a new > limiter class, StreamInboundRateLimiter, and use it in the StreamDeserializer > class. I use Row::dataSize() to get the input throughput in > StreamDeserializer::newPartition(), but I am not sure whether dataSize() > returns the correct data size. > Can someone please tell me how to do it?
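The proposal above is essentially a token-bucket limiter on the stream-deserialization path. As a hedged illustration of the general idea only — the class name {{StreamInboundRateLimiter}}, the pre-filled one-second budget, and the interrupt handling below are assumptions, not the patch's actual implementation:

```java
// Minimal token-bucket rate limiter in plain Java (no external libraries).
public class InboundRateLimiter {
    private final double bytesPerSecond;
    private double available;      // tokens (bytes) currently in the bucket
    private long lastRefillNanos;

    public InboundRateLimiter(double bytesPerSecond) {
        this.bytesPerSecond = bytesPerSecond;
        this.available = bytesPerSecond;   // design choice: start with 1 s of budget
        this.lastRefillNanos = System.nanoTime();
    }

    // Blocks until `bytes` may be consumed, bounding inbound throughput.
    public synchronized void acquire(long bytes) {
        refill();
        while (available < bytes) {
            long sleepMillis = (long) Math.ceil(
                    (bytes - available) / bytesPerSecond * 1000.0);
            try {
                wait(Math.max(1, sleepMillis));   // wait for tokens to accrue
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;                            // give up on interrupt
            }
            refill();
        }
        available -= bytes;
    }

    // Credits tokens for the time elapsed since the last refill, capped at
    // one second's worth so idle periods don't build an unbounded burst.
    private void refill() {
        long now = System.nanoTime();
        available = Math.min(bytesPerSecond,
                available + (now - lastRefillNanos) / 1e9 * bytesPerSecond);
        lastRefillNanos = now;
    }
}
```

In the patch as described, the deserialization path would call something like {{acquire(partitionSize)}} per decoded partition; whether {{Row::dataSize()}} is the right measure of that size is exactly the question the reporter raises.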
[jira] [Updated] (CASSANDRA-14097) Per-node stream concurrency
[ https://issues.apache.org/jira/browse/CASSANDRA-14097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-14097: --- Issue Type: Improvement (was: Bug) > Per-node stream concurrency > --- > > Key: CASSANDRA-14097 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14097 > Project: Cassandra > Issue Type: Improvement >Reporter: Eric Evans > > Stream sessions with a remote are bound to a single thread, and when > compression is in use this thread can be CPU bound, limiting throughput > considerably. When the number of nodes is small (i.e. when the number of > concurrent sessions is also low), rebuilds or bootstrap operations can take a > very long time. > Ideally, data could be streamed from any given remote concurrently. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14097) Per-node stream concurrency
Eric Evans created CASSANDRA-14097: -- Summary: Per-node stream concurrency Key: CASSANDRA-14097 URL: https://issues.apache.org/jira/browse/CASSANDRA-14097 Project: Cassandra Issue Type: Bug Reporter: Eric Evans Stream sessions with a remote are bound to a single thread, and when compression is in use this thread can be CPU bound, limiting throughput considerably. When the number of nodes is small (i.e. when the number of concurrent sessions is also low), rebuilds or bootstrap operations can take a very long time. Ideally, data could be streamed from any given remote concurrently. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14074) Remove "OpenJDK is not recommended" Startup Warning
[ https://issues.apache.org/jira/browse/CASSANDRA-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16271073#comment-16271073 ] Eric Evans commented on CASSANDRA-14074: +1 > Remove "OpenJDK is not recommended" Startup Warning > --- > > Key: CASSANDRA-14074 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14074 > Project: Cassandra > Issue Type: Improvement >Reporter: Michael Kjellman > Labels: lhf > > We should remove the following warning on C* startup that OpenJDK is not > recommended. Now that with JDK8 OpenJDK is the reference JVM implementation > and things are much more stable -- and that all of our tests run on OpenJDK > builds due to the Oracle JDK license, this warning isn't helpful and is > actually wrong and we should remove it to prevent any user confusion. > WARN [main] 2017-11-28 19:39:08,446 StartupChecks.java:202 - OpenJDK is not > recommended. Please upgrade to the newest Oracle Java release -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13916) Remove OpenJDK log warning
[ https://issues.apache.org/jira/browse/CASSANDRA-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185919#comment-16185919 ] Eric Evans commented on CASSANDRA-13916: +1 This warning just creates confusion, AFAICT > Remove OpenJDK log warning > -- > > Key: CASSANDRA-13916 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13916 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Anthony Grasso >Priority: Minor > > The following warning message will appear in the logs when using OpenJDK > {noformat} > WARN [main] ... OpenJDK is not recommended. Please upgrade to the newest > Oracle Java release > {noformat} > The above warning dates back to when OpenJDK 6 was released and there were > some issues in early releases of this version. The OpenJDK implementation is > used as a reference for the OracleJDK which means the implementations are > very close. In addition, most users have moved off Java 6 so we can probably > remove this warning message. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13754) BTree.Builder memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164860#comment-16164860 ] Eric Evans commented on CASSANDRA-13754: {quote} I think it might still be the same issue. The threads you mentioned are all created by the SEPWorker as well, as you can also see in your screenshot where your FastThreadLocalThread has a reference to an instance of that class. Now I'm not sure whether the actual content of your ThreadLocalMaps is the same as in my heap dump - in my case, the maps mostly held instances of BTree$Builder, which then had references to many byte[] arrays. Maybe you can check if this is the case for you as well? {quote} There are no instances of {{BTree}} here (see the new screenshot, attached). {quote} Other than that, you could also try and see if the patches created by Robert Stupp alleviate your issue. {quote} Do you mean [bed7fa5|https://github.com/apache/cassandra/commit/bed7fa5ef8492d1ff3852cf299622a5ad4e0b621]? I haven't applied that, no, but it doesn't look like I'm leaking {{BTree}} instances, so I don't think that would help. > BTree.Builder memory leak > - > > Key: CASSANDRA-13754 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13754 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15 >Reporter: Eric Evans >Assignee: Robert Stupp > Fix For: 3.11.1 > > Attachments: Screenshot from 2017-09-11 16-54-43.png, Screenshot from > 2017-09-13 10-39-58.png > > > After a chronic bout of {{OutOfMemoryError}} in our development environment, > a heap analysis is showing that more than 10G of our 12G heaps are consumed > by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) > of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances. 
> Reverting > [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54] > fixes the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13754) BTree.Builder memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-13754: --- Attachment: Screenshot from 2017-09-13 10-39-58.png > BTree.Builder memory leak > - > > Key: CASSANDRA-13754 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13754 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15 >Reporter: Eric Evans >Assignee: Robert Stupp > Fix For: 3.11.1 > > Attachments: Screenshot from 2017-09-11 16-54-43.png, Screenshot from > 2017-09-13 10-39-58.png > > > After a chronic bout of {{OutOfMemoryError}} in our development environment, > a heap analysis is showing that more than 10G of our 12G heaps are consumed > by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) > of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances. > Reverting > [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54] > fixes the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13754) BTree.Builder memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-13754: --- Attachment: Screenshot from 2017-09-11 16-54-43.png > BTree.Builder memory leak > - > > Key: CASSANDRA-13754 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13754 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15 >Reporter: Eric Evans >Assignee: Robert Stupp > Fix For: 3.11.1 > > Attachments: Screenshot from 2017-09-11 16-54-43.png > > > After a chronic bout of {{OutOfMemoryError}} in our development environment, > a heap analysis is showing that more than 10G of our 12G heaps are consumed > by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) > of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances. > Reverting > [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54] > fixes the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Reopened] (CASSANDRA-13754) BTree.Builder memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans reopened CASSANDRA-13754: I apologize for chiming in here so late, but I'm not sure this addresses what I was seeing. In my dumps, all of the heap is tied up in the {{ThreadLocalMap}}s of instances of {{FastThreadLocalThread}} for _Native-Transport-Requests_, _RequestResponseStage_, _ReadStage_, etc.; I think what I was seeing is different from what [~markusdlugi] saw. See the attached screenshot of the dominator tree view in MemoryAnalyzer. I can make a dump available, but be warned: it is 12G in size. > BTree.Builder memory leak > - > > Key: CASSANDRA-13754 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13754 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15 >Reporter: Eric Evans >Assignee: Robert Stupp > Fix For: 3.11.1 > > > After a chronic bout of {{OutOfMemoryError}} in our development environment, > a heap analysis is showing that more than 10G of our 12G heaps are consumed > by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) > of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances. > Reverting > [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54] > fixes the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13754) FastThreadLocal leaks memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-13754: --- Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15 (was: OpenJDK 8u141-b15) > FastThreadLocal leaks memory > > > Key: CASSANDRA-13754 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13754 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15 >Reporter: Eric Evans > Fix For: 3.11.1 > > > After a chronic bout of {{OutOfMemoryError}} in our development environment, > a heap analysis is showing that more than 10G of our 12G heaps are consumed > by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) > of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances. > Reverting > [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54] > fixes the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13754) FastThreadLocal leaks memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122390#comment-16122390 ] Eric Evans commented on CASSANDRA-13754: Cassandra 3.11.0, Netty 4.0.44.Final > FastThreadLocal leaks memory > > > Key: CASSANDRA-13754 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13754 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: OpenJDK 8u141-b15 >Reporter: Eric Evans > Fix For: 3.11.1 > > > After a chronic bout of {{OutOfMemoryError}} in our development environment, > a heap analysis is showing that more than 10G of our 12G heaps are consumed > by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) > of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances. > Reverting > [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54] > fixes the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-13754) FastThreadLocal leaks memory
Eric Evans created CASSANDRA-13754: -- Summary: FastThreadLocal leaks memory Key: CASSANDRA-13754 URL: https://issues.apache.org/jira/browse/CASSANDRA-13754 Project: Cassandra Issue Type: Bug Components: Core Environment: OpenJDK 8u141-b15 Reporter: Eric Evans Fix For: 3.11.1 After a chronic bout of {{OutOfMemoryError}} in our development environment, a heap analysis is showing that more than 10G of our 12G heaps are consumed by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances. Reverting [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54] fixes the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13006) Disable automatic heap dumps on OOM error
[ https://issues.apache.org/jira/browse/CASSANDRA-13006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118421#comment-16118421 ] Eric Evans commented on CASSANDRA-13006: I think this behavior (invoking {{jmap}} on OOM) is a pretty serious violation of the principle of least surprise. We already provide mechanisms for passing arguments to the JVM, and TTBMK, all of them provide some means for dropping a heap dump on out-of-memory. It definitely caught me by surprise. We carried over {{-XX:+HeapDumpOnOutOfMemoryError}} from our 2.2.x environment, only to have Cassandra and the JVM racing to create a dump of the same name. Additionally, something about all of this is buggy, because on more than one occasion we've had Cassandra fork-bombing {{jmap}} processes: {noformat} ● cassandra-b.service - distributed storage system for structured data Loaded: loaded (/lib/systemd/system/cassandra-b.service; static) Active: active (running) since Sat 2017-08-05 22:32:07 UTC; 23h ago Main PID: 25025 (java) CGroup: /system.slice/cassandra-b.service ├─ 9213 jmap -histo 25025 ├─ 9214 jmap -histo 25025 ├─ 9284 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─ 9285 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─ 9388 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─ 9453 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─ 9519 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─ 9520 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─ 9733 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─ 9735 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─ 9736 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─14835 jmap 
-dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─14836 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─14837 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─14839 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─14841 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─14844 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18932 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18933 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18934 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18935 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18936 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18937 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18938 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18939 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18940 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18942 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18943 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18944 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 ├─18945 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025 [ ... ] {noformat} IMO, the sanest strategy here would be to leave the creation of heap dumps to the JVM. 
> Disable automatic heap dumps on OOM error > - > > Key: CASSANDRA-13006 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13006 > Project: Cassandra > Issue Type: Bug > Components: Configuration >Reporter: anmols >Assignee: Benjamin Lerer >Priority: Minor > Fix For: 3.0.9 > > Attachments: 13006-3.0.9.txt > > > With CASSANDRA-9861, a change was added to enable collecting heap dumps by > default if the process encountered an OOM error. These heap dumps are stored > in the Apache Cassandra home directory unless configured otherwise (see > [Cassandra Support > Do
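The "leave heap dumps to the JVM" approach argued for in the comment above relies on HotSpot's built-in mechanism, normally enabled at startup with {{-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=...}}. As a sketch, the same flags can also be toggled at runtime through the HotSpot-specific diagnostic MXBean; the helper name and dump path below are illustrative, not Cassandra code:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumpConfig {
    // Enables the JVM's own heap-dump-on-OOM behavior (equivalent to the
    // startup flag) and returns the resulting value of the option.
    static String enableOomHeapDump(String path) {
        HotSpotDiagnosticMXBean hs =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        hs.setVMOption("HeapDumpOnOutOfMemoryError", "true");
        hs.setVMOption("HeapDumpPath", path);  // only consulted at dump time
        return hs.getVMOption("HeapDumpOnOutOfMemoryError").getValue();
    }
}
```

Because the dump is written by the JVM itself, there is exactly one writer and no race with an external {{jmap}} child process.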
[jira] [Commented] (CASSANDRA-13544) Exceptions encountered for concurrent range deletes with mixed cluster keys
[ https://issues.apache.org/jira/browse/CASSANDRA-13544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031597#comment-16031597 ] Eric Evans commented on CASSANDRA-13544: {quote} Would you have an easy way to try the tip of 3.11? I would need to look more closely to say more, but we did fix genuine range tombstone bugs recently-ish, so checking it's not fixed would be really nice. {quote} OK, I've tested 3.11 (5a860a7) and am not seeing any exceptions. > Exceptions encountered for concurrent range deletes with mixed cluster keys > --- > > Key: CASSANDRA-13544 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13544 > Project: Cassandra > Issue Type: Bug > Components: Core, Local Write-Read Paths > Environment: Cassandra 3.7, Debian Linux >Reporter: Eric Evans > > Using a schema that looks something like... > {code:sql} > CREATE TABLE data ( > key text, > rev int, > tid timeuuid, > value blob, > PRIMARY KEY (key, rev, tid) > ) WITH CLUSTERING ORDER BY (rev DESC, tid DESC) > {code} > ...we are performing range deletes using inequality operators on both {{rev}} > and {{tid}} ({{WHERE key = ? AND rev < ?}} and {{WHERE key = ? AND rev = ? > AND tid < ?}}). These range deletes are interleaved with normal writes > probabilistically, and (apparently) when two such range deletes occur > concurrently, the following exceptions result. 
> {noformat} > ERROR [SharedPool-Worker-18] 2017-05-19 17:30:22,426 Message.java:611 - > Unexpected exception during request; channel = [id: 0x793a853b, > L:/10.64.0.36:9042 - R:/10.64.32.112:550 > 48] > java.lang.AssertionError: null > at > org.apache.cassandra.db.ClusteringBoundOrBoundary.(ClusteringBoundOrBoundary.java:31) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.ClusteringBoundary.(ClusteringBoundary.java:15) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.ClusteringBoundOrBoundary.inclusiveCloseExclusiveOpen(ClusteringBoundOrBoundary.java:78) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.RangeTombstoneBoundaryMarker.inclusiveCloseExclusiveOpen(RangeTombstoneBoundaryMarker.java:54) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.RangeTombstoneMarker$Merger.merge(RangeTombstoneMarker.java:139) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:521) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:478) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:460) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:320) > 
~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:113) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.transform.FilteredRows.isEmpty(FilteredRows.java:30) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.transform.Filter.closeIfEmpty(Filter.java:53) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:23) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:6) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:735) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:410) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apa
[jira] [Commented] (CASSANDRA-13544) Exceptions encountered for concurrent range deletes with mixed cluster keys
[ https://issues.apache.org/jira/browse/CASSANDRA-13544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025446#comment-16025446 ] Eric Evans commented on CASSANDRA-13544: I can confirm this is still present in 3.10 (see exception text below); I'll give 3.11 a try and report back {noformat} WARN [MutationStage-5] 2017-05-25 19:30:41,605 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[MutationStage-5,5,main]: {} java.lang.AssertionError: null at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:536) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.db.RangeTombstoneList.addAll(RangeTombstoneList.java:217) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.db.MutableDeletionInfo.add(MutableDeletionInfo.java:141) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:143) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.db.Memtable.put(Memtable.java:284) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1316) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:618) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.db.Keyspace.applyFuture(Keyspace.java:425) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:222) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:68) ~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-3.10.jar:3.10] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131] at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) 
~[apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) [apache-cassandra-3.10.jar:3.10] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.10.jar:3.10] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131] {noformat} > Exceptions encountered for concurrent range deletes with mixed cluster keys > --- > > Key: CASSANDRA-13544 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13544 > Project: Cassandra > Issue Type: Bug > Components: Core, Local Write-Read Paths > Environment: Cassandra 3.7, Debian Linux >Reporter: Eric Evans > > Using a schema that looks something like... > {code:sql} > CREATE TABLE data ( > key text, > rev int, > tid timeuuid, > value blob, > PRIMARY KEY (key, rev, tid) > ) WITH CLUSTERING ORDER BY (rev DESC, tid DESC) > {code} > ...we are performing range deletes using inequality operators on both {{rev}} > and {{tid}} ({{WHERE key = ? AND rev < ?}} and {{WHERE key = ? AND rev = ? > AND tid < ?}}). These range deletes are interleaved with normal writes > probabilistically, and (apparently) when two such range deletes occur > concurrently, the following exceptions result. 
> {noformat} > ERROR [SharedPool-Worker-18] 2017-05-19 17:30:22,426 Message.java:611 - > Unexpected exception during request; channel = [id: 0x793a853b, > L:/10.64.0.36:9042 - R:/10.64.32.112:550 > 48] > java.lang.AssertionError: null > at > org.apache.cassandra.db.ClusteringBoundOrBoundary.(ClusteringBoundOrBoundary.java:31) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.ClusteringBoundary.(ClusteringBoundary.java:15) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.ClusteringBoundOrBoundary.inclusiveCloseExclusiveOpen(ClusteringBoundOrBoundary.java:78) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.RangeTombstoneBoundaryMarker.inclusiveCloseExclusiveOpen(RangeTombstoneBoundaryMarker.java:54) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.RangeTombstoneMarker$Merger.merge(RangeTombstoneMarker.java:139) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:521) > ~[apache-cassandra-3.7.3.jar:3.7.3] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:478) > ~[apache-cassandra-3.7.
[jira] [Created] (CASSANDRA-13544) Exceptions encountered for concurrent range deletes with mixed cluster keys
Eric Evans created CASSANDRA-13544: -- Summary: Exceptions encountered for concurrent range deletes with mixed cluster keys Key: CASSANDRA-13544 URL: https://issues.apache.org/jira/browse/CASSANDRA-13544 Project: Cassandra Issue Type: Bug Components: Core, Local Write-Read Paths Environment: Cassandra 3.7, Debian Linux Reporter: Eric Evans Using a schema that looks something like... {code:sql} CREATE TABLE data ( key text, rev int, tid timeuuid, value blob, PRIMARY KEY (key, rev, tid) ) WITH CLUSTERING ORDER BY (rev DESC, tid DESC) {code} ...we are performing range deletes using inequality operators on both {{rev}} and {{tid}} ({{WHERE key = ? AND rev < ?}} and {{WHERE key = ? AND rev = ? AND tid < ?}}). These range deletes are interleaved with normal writes probabilistically, and (apparently) when two such range deletes occur concurrently, the following exceptions result. {noformat} ERROR [SharedPool-Worker-18] 2017-05-19 17:30:22,426 Message.java:611 - Unexpected exception during request; channel = [id: 0x793a853b, L:/10.64.0.36:9042 - R:/10.64.32.112:550 48] java.lang.AssertionError: null at org.apache.cassandra.db.ClusteringBoundOrBoundary.(ClusteringBoundOrBoundary.java:31) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.ClusteringBoundary.(ClusteringBoundary.java:15) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.ClusteringBoundOrBoundary.inclusiveCloseExclusiveOpen(ClusteringBoundOrBoundary.java:78) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.rows.RangeTombstoneBoundaryMarker.inclusiveCloseExclusiveOpen(RangeTombstoneBoundaryMarker.java:54) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.rows.RangeTombstoneMarker$Merger.merge(RangeTombstoneMarker.java:139) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:521) ~[apache-cassandra-3.7.3.jar:3.7.3] at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:478) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:460) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:320) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:113) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.transform.FilteredRows.isEmpty(FilteredRows.java:30) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.transform.Filter.closeIfEmpty(Filter.java:53) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:23) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:6) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:735) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:410) ~[apache-cassandra-3.7.3.jar:3.7.3] at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:363) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:237) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:78) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:486) ~[apache-cassandra-3.7.3.jar:3.7.3] at org.apache.cassandra.cql3.QueryProcessor.proce
[jira] [Created] (CASSANDRA-13245) Unable to match traces to queries
Eric Evans created CASSANDRA-13245: -- Summary: Unable to match traces to queries Key: CASSANDRA-13245 URL: https://issues.apache.org/jira/browse/CASSANDRA-13245 Project: Cassandra Issue Type: Bug Components: Core Reporter: Eric Evans Tracing queries node-wide à la {{nodetool settraceprobability}} is of limited utility when you are using prepared statements; I cannot find any way of associating a trace session with an application query (it's not even possible to make out the keyspace name). https://gist.github.com/eevans/81650261c2b1b5b99f83112865fef24b -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12698) add json/yaml format option to nodetool status
[ https://issues.apache.org/jira/browse/CASSANDRA-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15523500#comment-15523500 ] Eric Evans commented on CASSANDRA-12698: [~shoshii]: You might also be interested in https://github.com/eevans/creole. > add json/yaml format option to nodetool status > -- > > Key: CASSANDRA-12698 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12698 > Project: Cassandra > Issue Type: Improvement >Reporter: Shogo Hoshii >Assignee: Shogo Hoshii > Attachments: ntstatus_json.patch, sample.json, sample.yaml > > > Hello, > This patch enables nodetool status to be output in json/yaml format. > I think this format could be useful interface for tools that operate or > deploy cassandra. > The format could free tools from parsing the result in their own way. > It would be great if someone would review this patch. > Thank you. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
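A machine-readable rendering like the attached sample.json would let tooling consume {{nodetool status}} without scraping column-aligned text. A minimal sketch of such a consumer; the JSON field names below are hypothetical and not taken from the attached patch:

```python
import json

# Hypothetical JSON shape for a json-format `nodetool status`; the real
# field names depend on the attached patch (see sample.json).
status_json = """
{
  "datacenters": {
    "dc1": [
      {"address": "10.0.0.1", "status": "UN", "load_bytes": 1073741824, "owns_pct": 50.0},
      {"address": "10.0.0.2", "status": "DN", "load_bytes": 2147483648, "owns_pct": 50.0}
    ]
  }
}
"""

def down_nodes(doc):
    """Return the addresses of nodes not reported Up/Normal."""
    status = json.loads(doc)
    return [
        node["address"]
        for nodes in status["datacenters"].values()
        for node in nodes
        if node["status"] != "UN"
    ]

print(down_nodes(status_json))  # -> ['10.0.0.2']
```

Compared with parsing the human-oriented table, a consumer like this keeps working even if column widths or ordering change.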
[jira] [Created] (CASSANDRA-12558) Consider adding License, Sponsorship, Thanks, and Security links to site navigation
Eric Evans created CASSANDRA-12558: -- Summary: Consider adding License, Sponsorship, Thanks, and Security links to site navigation Key: CASSANDRA-12558 URL: https://issues.apache.org/jira/browse/CASSANDRA-12558 Project: Cassandra Issue Type: Wish Components: Documentation and Website Reporter: Eric Evans Priority: Minor [The Apache Project Branding Requirements|http://www.apache.org/foundation/marks/pmcs.html#navigation] state that our website navigation should include License, Sponsorship, Thanks, and Security links. By my reading, the use of the word _should_ falls short of making this a hard requirement, but I can't think of a good reason not to include these. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11752) histograms/metrics in 2.2 do not appear recency biased
[ https://issues.apache.org/jira/browse/CASSANDRA-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439541#comment-15439541 ] Eric Evans commented on CASSANDRA-11752: FWIW, I tested this patch, and compared the plots to those created from using [ExponentiallyDecayingReservoir|https://github.com/dropwizard/metrics/blob/3.1-maintenance/metrics-core/src/main/java/com/codahale/metrics/ExponentiallyDecayingReservoir.java]. Results can be seen here: https://phabricator.wikimedia.org/T137474#2584099 > histograms/metrics in 2.2 do not appear recency biased > -- > > Key: CASSANDRA-11752 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11752 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Chris Burroughs >Assignee: Per Otterström > Labels: metrics > Fix For: 2.2.8, 3.0.9, 3.8 > > Attachments: 11752-2.2-v2.txt, 11752-2.2.txt, boost-metrics.png, > c-jconsole-comparison.png, c-metrics.png, default-histogram.png, > server-patch-v2.png > > > In addition to upgrading to metrics3, CASSANDRA-5657 switched to using a > custom histogram implementation. After upgrading to Cassandra 2.2 > histograms/timer metrics are now suspiciously flat. To be useful for > graphing and alerting metrics need to be biased towards recent events. > I have attached images that I think illustrate this. > * The first two are a comparison between latency observed by a C* 2.2 (us) > cluster showing very flat lines and a client (using metrics 2.2.0, ms) > showing server performance problems. We can't rule out with total certainty > that something else isn't the cause (that's why we measure from both the > client & server) but they very rarely disagree. > * The 3rd image compares jconsole viewing of metrics on a 2.2 and 2.1 > cluster over several minutes. Not a single digit changed on the 2.2 cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12281) Gossip blocks on startup when another node is bootstrapping
Eric Evans created CASSANDRA-12281: -- Summary: Gossip blocks on startup when another node is bootstrapping Key: CASSANDRA-12281 URL: https://issues.apache.org/jira/browse/CASSANDRA-12281 Project: Cassandra Issue Type: Bug Components: Core Reporter: Eric Evans Priority: Minor Attachments: restbase1015-a_jstack.txt In our cluster, normal node startup times (after a drain on shutdown) are less than 1 minute. However, when another node in the cluster is bootstrapping, the same node startup takes nearly 30 minutes to complete, the apparent result of gossip blocking on pending range calculations.
{noformat}
$ nodetool-a tpstats
Pool Name                  Active  Pending  Completed  Blocked  All time blocked
MutationStage                   0        0       1840        0                 0
ReadStage                       0        0       2350        0                 0
RequestResponseStage            0        0         53        0                 0
ReadRepairStage                 0        0          1        0                 0
CounterMutationStage            0        0          0        0                 0
HintedHandoff                   0        0         44        0                 0
MiscStage                       0        0          0        0                 0
CompactionExecutor              3     3395          0        0
MemtableReclaimMemory           0        0         30        0                 0
PendingRangeCalculator          1        2         29        0                 0
GossipStage                     1  5602164          0        0
MigrationStage                  0        0          0        0                 0
MemtablePostFlush               0        0        111        0                 0
ValidationExecutor              0        0          0        0                 0
Sampler                         0        0          0        0                 0
MemtableFlushWriter             0        0         30        0                 0
InternalResponseStage           0        0          0        0                 0
AntiEntropyStage                0        0          0        0                 0
CacheCleanupExecutor            0        0          0        0                 0

Message type        Dropped
READ                      0
RANGE_SLICE               0
_TRACE                    0
MUTATION                  0
COUNTER_MUTATION          0
REQUEST_RESPONSE          0
PAGED_RANGE               0
READ_REPAIR               0
{noformat}
A full thread dump is attached, but the relevant bit seems to be here:
{noformat}
[ ... 
] "GossipStage:1" #1801 daemon prio=5 os_prio=0 tid=0x7fe4cd54b000 nid=0xea9 waiting on condition [0x7fddcf883000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0004c1e922c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) at org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:174) at org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:160) at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2023) at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1682) at org.apache.cassandra.gms.Gossiper.doOnChangeNotifications(Gossiper.java:1182) at org.apache.cassandra.gms.Gossiper.applyNewStates(Gossiper.java:1165) at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1128) at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:58) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolE
[jira] [Updated] (CASSANDRA-12063) Brotli storage compression
[ https://issues.apache.org/jira/browse/CASSANDRA-12063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-12063: --- Fix Version/s: 2.2.x > Brotli storage compression > -- > > Key: CASSANDRA-12063 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12063 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Eric Evans >Assignee: Eric Evans >Priority: Minor > Fix For: 2.2.x > > > Brotli is a compression algorithm based on a modern variant of the LZ77 > algorithm, Huffman coding and 2nd order context modeling. It produces > smaller compressed sizes at costs comparable to deflate. > I have a working [ICompressor > implementation|https://github.com/eevans/cassandra-brotli] which has received > a fair amount of testing already. I'll follow up shortly with a Cassandra > changeset(s) for review. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12063) Brotli storage compression
Eric Evans created CASSANDRA-12063: -- Summary: Brotli storage compression Key: CASSANDRA-12063 URL: https://issues.apache.org/jira/browse/CASSANDRA-12063 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Brotli is a compression algorithm based on a modern variant of the LZ77 algorithm, Huffman coding and 2nd order context modeling. It produces smaller compressed sizes at costs comparable to deflate. I have a working [ICompressor implementation|https://github.com/eevans/cassandra-brotli] which has received a fair amount of testing already. I'll follow up shortly with a Cassandra changeset(s) for review. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11940) Look into better default file_cache_size for 2.2
[ https://issues.apache.org/jira/browse/CASSANDRA-11940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331973#comment-15331973 ] Eric Evans commented on CASSANDRA-11940: In the plot below, nodes cerium, praseodymium (worst. hostname. evar.), and xenon are under a workload consisting primarily of random reads. At ~11:00, {{disk_access_mode}} is changed from {{auto}} to {{mmap_index_only}} on cerium and praseodymium only. When this happens, disk read throughput drops dramatically, as does the number of major page faults generated by Cassandra (from ~280/s to ~15/s). Over the course of this run, I increased {{file_cache_size_in_mb}} from 512, to 768, 1024, and 2048, with no observable difference to disk read throughput, or the rate of major page faults. These machines are 12-way (w/ hyperthreading), 16G RAM, and rotational disks. They have 4G heaps (though raised to 6G for the {{file_cache_size_in_mb: 2048}} test) !abnormal disk read throughput.png|width=960! 
{noformat}
CREATE TABLE "local_group_wikipedia_T_parsoid_html".data (
    "_domain" text,
    key text,
    rev int,
    tid timeuuid,
    "_del" timeuuid,
    "content-location" text,
    "content-sha256" blob,
    "content-type" text,
    "latestTid" timeuuid,
    tags set,
    value blob,
    PRIMARY KEY (("_domain", key), rev, tid)
) WITH CLUSTERING ORDER BY (rev DESC, tid DESC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'tombstone_threshold': '0.02', 'unchecked_tombstone_compaction': 'true', 'base_time_seconds': '45', 'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
    AND compression = {'chunk_length_kb': '512', 'sstable_compression': 'org.apache.cassandra.io.compress.DeflateCompressor'}
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 86400
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
{noformat}
> Look into better default file_cache_size for 2.2 > > > Key: CASSANDRA-11940 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11940 > Project: Cassandra > Issue Type: Improvement >Reporter: T Jake Luciani >Assignee: T Jake Luciani > Fix For: 2.2.x > > Attachments: abnormal disk read throughput.png > > > CASSANDRA-8464 added support for mmapped decompression where in version <= > 2.1 the reads were all decompressed in standard heap buffers. > Since the usage of the file_cache is based solely on the buffer capacity we > should/can make this much larger in 2.2 when the disk access mode is mmap. > The downside of this cache being too small is made worse by 8464 since the > buffers are mmapped/unmapped causing explicit page faults. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11940) Look into better default file_cache_size for 2.2
[ https://issues.apache.org/jira/browse/CASSANDRA-11940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-11940: --- Attachment: abnormal disk read throughput.png > Look into better default file_cache_size for 2.2 > > > Key: CASSANDRA-11940 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11940 > Project: Cassandra > Issue Type: Improvement >Reporter: T Jake Luciani >Assignee: T Jake Luciani > Fix For: 2.2.x > > Attachments: abnormal disk read throughput.png > > > CASSANDRA-8464 added support for mmapped decompression where in version <= > 2.1 the reads were all decompressed in standard heap buffers. > Since the usage of the file_cache is based solely on the buffer capacity we > should/can make this much larger in 2.2 when the disk access mode is mmap. > The downside of this cache being too small is made worse by 8464 since the > buffers are mmapped/unmapped causing explicit page faults. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11752) histograms/metrics in 2.2 do not appear recency biased
[ https://issues.apache.org/jira/browse/CASSANDRA-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327891#comment-15327891 ] Eric Evans commented on CASSANDRA-11752: Just adding a Me Too to the list of those looking for a solution to this; for those limited to something like Grafana/Graphite to plot these percentiles, the status quo is pretty unhelpful (I was forced to patch my production systems to use {{com.codahale.metrics.ExponentiallyDecayingReservoir}} in the interim). > histograms/metrics in 2.2 do not appear recency biased > -- > > Key: CASSANDRA-11752 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11752 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Chris Burroughs > Labels: metrics > Attachments: boost-metrics.png, c-jconsole-comparison.png, > c-metrics.png, default-histogram.png > > > In addition to upgrading to metrics3, CASSANDRA-5657 switched to using a > custom histogram implementation. After upgrading to Cassandra 2.2 > histograms/timer metrics are now suspiciously flat. To be useful for > graphing and alerting metrics need to be biased towards recent events. > I have attached images that I think illustrate this. > * The first two are a comparison between latency observed by a C* 2.2 (us) > cluster showing very flat lines and a client (using metrics 2.2.0, ms) > showing server performance problems. We can't rule out with total certainty > that something else isn't the cause (that's why we measure from both the > client & server) but they very rarely disagree. > * The 3rd image compares jconsole viewing of metrics on a 2.2 and 2.1 > cluster over several minutes. Not a single digit changed on the 2.2 cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
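For reference, the recency bias that {{ExponentiallyDecayingReservoir}} provides (and that the flat 2.2 histograms lack) can be sketched in a few lines. This is a toy forward-decay priority sample illustrating the idea, not the actual dropwizard implementation:

```python
import heapq
import math
import random

class DecayingReservoir:
    """Toy forward-decay reservoir in the spirit of dropwizard's
    ExponentiallyDecayingReservoir: each sample gets a weight that grows
    exponentially with its timestamp, so recent samples win the priority
    sampling and a snapshot reflects recent behaviour rather than the
    all-time distribution."""

    def __init__(self, size, alpha):
        self.size = size
        self.alpha = alpha
        self.heap = []  # min-heap of (priority, value); lowest priority evicted

    def update(self, value, now):
        weight = math.exp(self.alpha * now)
        # Clamp u away from zero to avoid division blow-ups.
        priority = weight / max(random.random(), 1e-12)
        if len(self.heap) < self.size:
            heapq.heappush(self.heap, (priority, value))
        elif priority > self.heap[0][0]:
            heapq.heapreplace(self.heap, (priority, value))

    def values(self):
        return sorted(v for _, v in self.heap)

random.seed(42)
res = DecayingReservoir(size=100, alpha=1.0)
for i in range(10_000):           # 10k old latencies of ~100 units, times 0..50
    res.update(100, now=i * 0.005)
for i in range(200):              # 200 recent latencies of ~1000 units, times 95..100
    res.update(1000, now=95 + i * 0.025)

snap = res.values()
# A recency-biased snapshot tracks the recent (slow) samples, even though
# they are only ~2% of all updates; a flat all-time histogram would not.
print(snap[len(snap) // 2])       # -> 1000
```

With {{alpha}} this aggressive the recent samples' weights dominate the old ones by many orders of magnitude, so the snapshot's median is the recent latency, which is exactly the behaviour wanted for graphing and alerting.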
[jira] [Commented] (CASSANDRA-11931) IllegalStateException thrown fetching metrics histograms
[ https://issues.apache.org/jira/browse/CASSANDRA-11931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314491#comment-15314491 ] Eric Evans commented on CASSANDRA-11931: Indeed, it is the same metric, {{ColUpdateTimeDeltaHistogram}}. > IllegalStateException thrown fetching metrics histograms > > > Key: CASSANDRA-11931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11931 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Eric Evans > Fix For: 2.2.x > > > After upgrading to 2.2.6 (from 2.1.13), I'm seeing the following exception > while collecting metrics. > {noformat} > Caused by: java.lang.IllegalStateException: Unable to compute when histogram > overflowed > at > org.apache.cassandra.utils.EstimatedHistogram.percentile(EstimatedHistogram.java:199) > ~[apache-cassandra-2.2.6.jar:2.2.6] > at > org.apache.cassandra.metrics.EstimatedHistogramReservoir$HistogramSnapshot.getValue(EstimatedHistogramReservoir.java:85) > ~[na:na] > at com.codahale.metrics.Snapshot.getMedian(Snapshot.java:38) ~[na:na] > at > org.apache.cassandra.metrics.CassandraMetricsRegistry$JmxHistogram.get50thPercentile(CassandraMetricsRegistry.java:218) > ~[apache-cassandra-2.2.6.jar:2.2.6] > at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) ~[na:na] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[na:1.8.0_91] > at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91] > at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) ~[na:1.8.0_91] > at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) ~[na:na] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[na:1.8.0_91] > at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91] > at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275) > ~[na:1.8.0_91] > at > 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) > ~[na:1.8.0_91] > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) > ~[na:1.8.0_91] > at > com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) > ~[na:1.8.0_91] > at > com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83) > ~[na:1.8.0_91] > at > com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206) > ~[na:1.8.0_91] > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647) > ~[na:1.8.0_91] > at > com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678) > ~[na:1.8.0_91] > at > javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1445) > ~[na:1.8.0_91] > at > javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76) > ~[na:1.8.0_91] > at > javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309) > ~[na:1.8.0_91] > at > javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401) > ~[na:1.8.0_91] > at > javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:639) > ~[na:1.8.0_91] > at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) ~[na:na] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[na:1.8.0_91] > at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91] > at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:324) > ~[na:1.8.0_91] > at sun.rmi.transport.Transport$1.run(Transport.java:200) ~[na:1.8.0_91] > at sun.rmi.transport.Transport$1.run(Transport.java:197) ~[na:1.8.0_91] > at java.security.AccessController.doPrivileged(Native Method) > ~[na:1.8.0_91] > at sun.rmi.transport.Transport.serviceCall(Transport.java:196) > ~[na:1.8.0_91] > at > 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568) > ~[na:1.8.0_91] > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826) > ~[na:1.8.0_91] > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683) > ~[na:1.8.0_91] > at java.security.AccessController.doPrivileged(Native Method) > ~[na:1.8.0_91] > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682) > ~[na:1.8.0_91] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_91] > at > java.util.co
[jira] [Created] (CASSANDRA-11931) IllegalStateException thrown fetching metrics histograms
Eric Evans created CASSANDRA-11931: -- Summary: IllegalStateException thrown fetching metrics histograms Key: CASSANDRA-11931 URL: https://issues.apache.org/jira/browse/CASSANDRA-11931 Project: Cassandra Issue Type: Bug Components: Core Reporter: Eric Evans After upgrading to 2.2.6 (from 2.1.13), I'm seeing the following exception while collecting metrics. {noformat} Caused by: java.lang.IllegalStateException: Unable to compute when histogram overflowed at org.apache.cassandra.utils.EstimatedHistogram.percentile(EstimatedHistogram.java:199) ~[apache-cassandra-2.2.6.jar:2.2.6] at org.apache.cassandra.metrics.EstimatedHistogramReservoir$HistogramSnapshot.getValue(EstimatedHistogramReservoir.java:85) ~[na:na] at com.codahale.metrics.Snapshot.getMedian(Snapshot.java:38) ~[na:na] at org.apache.cassandra.metrics.CassandraMetricsRegistry$JmxHistogram.get50thPercentile(CassandraMetricsRegistry.java:218) ~[apache-cassandra-2.2.6.jar:2.2.6] at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) ~[na:na] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_91] at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91] at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) ~[na:1.8.0_91] at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) ~[na:na] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_91] at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91] at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275) ~[na:1.8.0_91] at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) ~[na:1.8.0_91] at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) ~[na:1.8.0_91] at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) ~[na:1.8.0_91] at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83) ~[na:1.8.0_91] at 
com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206) ~[na:1.8.0_91] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647) ~[na:1.8.0_91] at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678) ~[na:1.8.0_91] at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1445) ~[na:1.8.0_91] at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76) ~[na:1.8.0_91] at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309) ~[na:1.8.0_91] at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401) ~[na:1.8.0_91] at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:639) ~[na:1.8.0_91] at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) ~[na:na] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_91] at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91] at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:324) ~[na:1.8.0_91] at sun.rmi.transport.Transport$1.run(Transport.java:200) ~[na:1.8.0_91] at sun.rmi.transport.Transport$1.run(Transport.java:197) ~[na:1.8.0_91] at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_91] at sun.rmi.transport.Transport.serviceCall(Transport.java:196) ~[na:1.8.0_91] at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568) ~[na:1.8.0_91] at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826) ~[na:1.8.0_91] at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683) ~[na:1.8.0_91] at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_91] at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682) ~[na:1.8.0_91] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_91] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_91] at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_91] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
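The failure mode is easy to reproduce in miniature: a bucketed estimated histogram records values above its largest bucket boundary in an overflow bucket, and once that bucket is non-empty a percentile can no longer be computed. A simplified sketch of the idea, not Cassandra's actual {{EstimatedHistogram}} code:

```python
class EstimatedHistogram:
    """Simplified bucketed histogram: values above the last bucket
    boundary land in an overflow bucket, after which percentile()
    refuses to answer -- the shape of the IllegalStateException above."""

    def __init__(self, bucket_offsets):
        self.offsets = bucket_offsets                   # ascending upper bounds
        self.buckets = [0] * (len(bucket_offsets) + 1)  # +1 overflow bucket

    def add(self, value):
        for i, bound in enumerate(self.offsets):
            if value <= bound:
                self.buckets[i] += 1
                return
        self.buckets[-1] += 1                           # overflowed

    def percentile(self, pct):
        if self.buckets[-1] > 0:
            raise ValueError("unable to compute when histogram overflowed")
        target = max(1, round(pct * sum(self.buckets)))
        seen = 0
        for i, count in enumerate(self.buckets[:-1]):
            seen += count
            if seen >= target:
                return self.offsets[i]
        return 0

h = EstimatedHistogram([1, 2, 4, 8, 16])
for v in (1, 2, 2, 3, 9):
    h.add(v)
print(h.percentile(0.50))   # -> 2
h.add(1_000_000)            # larger than the last bucket boundary
try:
    h.percentile(0.50)
except ValueError as e:
    print(e)                # -> unable to compute when histogram overflowed
```

A single outlier beyond the largest bucket is enough to poison every subsequent percentile read, which is why this surfaces as repeated exceptions during routine metrics collection.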
[jira] [Commented] (CASSANDRA-11351) rethink stream throttling logic
[ https://issues.apache.org/jira/browse/CASSANDRA-11351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196326#comment-15196326 ] Eric Evans commented on CASSANDRA-11351: [~pauloricardomg] I think you are right, yes. > rethink stream throttling logic > --- > > Key: CASSANDRA-11351 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11351 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams > > Currently, we throttle streaming from the outbound side, because throttling > from the inbound side is thought not to be doable. This creates a problem > because the total stream throughput depends on the number of nodes involved, > so it can vary with the operation being performed. This creates > operational overhead, as the throttle has to be constantly adjusted. > I propose we flip this logic on its head, and instead limit the total inbound > throughput. How? It's simple: we ask. Given a total inbound limit of > 200Mb/s, if a node is going to stream from 10 nodes, it would simply tell the > source nodes to only stream at 20Mb/s each when asking for the stream, thereby > never going over the 200Mb/s inbound limit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
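The division of the inbound budget described in the proposal is straightforward; as a sketch (the function name is hypothetical, not from any patch):

```python
def per_source_limit_mbps(total_inbound_mbps, n_sources):
    """Split a node's total inbound streaming budget evenly across the
    peers it is about to stream from, as the proposal above describes:
    the receiver asks each source to cap its send rate at this value."""
    if n_sources <= 0:
        raise ValueError("need at least one source node")
    return total_inbound_mbps / n_sources

# The example from the description: a 200Mb/s inbound cap split over
# 10 source nodes means asking each source to send at no more than 20Mb/s.
print(per_source_limit_mbps(200, 10))  # -> 20.0
```

Because the receiver computes the cap at the moment it requests each stream, the aggregate inbound rate stays bounded regardless of how many operations or peers are involved, which is the operational win over outbound-side throttling.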
[jira] [Commented] (CASSANDRA-9766) Bootstrap outgoing streaming speeds are much slower than during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149333#comment-15149333 ] Eric Evans commented on CASSANDRA-9766: --- So if I'm understanding this correctly (and I'm probably not), increasing the receiving buffer would get more of the data over the wire before blocking on the read from the buffer, decompression, etc. (up to however much the buffer was increased by). Is that right? If so, that wouldn't really help much; that would seem to imply that processing the compressed data is the bottleneck, and that the blocking is (rightfully) applying back-pressure to the network side. > Bootstrap outgoing streaming speeds are much slower than during repair > -- > > Key: CASSANDRA-9766 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9766 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 2.1.2. more details in the pdf attached >Reporter: Alexei K > Fix For: 2.1.x > > Attachments: problem.pdf > > > I have a cluster in the Amazon cloud; it's described in detail in the attachment. > What I've noticed is that during bootstrap we never go above 12MB/sec > transmission speeds, and those speeds flatline almost as if we're hitting > some sort of a limit (this remains true for other tests that I've run); > however, during repair we see much higher, variable sending rates. I've > provided network charts in the attachment as well. Is there an explanation > for this? Is something wrong with my configuration, or is it a possible bug? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
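The back-pressure reasoning in the comment above can be sketched numerically: in a blocking pipeline, sustained throughput is bounded by the slowest stage, and a larger receive buffer only delays the stall rather than removing it. A minimal illustration with hypothetical rates (not measured values):

```python
def steady_state_rate(network_mbps: float, processing_mbps: float) -> float:
    """In a blocking pipeline, back-pressure pins the sustained rate to
    the slowest stage, regardless of how large the buffers are."""
    return min(network_mbps, processing_mbps)

# Hypothetical: a fast network feeding a slow decompress/deserialize stage;
# enlarging the receive buffer cannot lift the sustained rate above the
# processing stage's capacity.
print(steady_state_rate(100.0, 12.0))  # -> 12.0
```

This is why the comment concludes that a bigger buffer "wouldn't really help much" if decompression is the bottleneck.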
[jira] [Commented] (CASSANDRA-9766) Bootstrap outgoing streaming speeds are much slower than during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149229#comment-15149229 ] Eric Evans commented on CASSANDRA-9766: --- I'm seeing something similar here; I get an eerily consistent 4.5MB/s _per stream_ (much less than the stream throughput limit, and the capability of the network). We have large partitions, large SSTables, and a mixture of 256k and 512k chunk lengths. [~yukim] what would be the best test of this? Would https://gist.github.com/eevans/81f02849eab7634871c9 do? > Bootstrap outgoing streaming speeds are much slower than during repair > -- > > Key: CASSANDRA-9766 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9766 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 2.1.2. more details in the pdf attached >Reporter: Alexei K > Fix For: 2.1.x > > Attachments: problem.pdf > > > I have a cluster in the Amazon cloud; it's described in detail in the attachment. > What I've noticed is that during bootstrap we never go above 12MB/sec > transmission speeds, and those speeds flatline almost as if we're hitting > some sort of a limit (this remains true for other tests that I've run); > however, during repair we see much higher, variable sending rates. I've > provided network charts in the attachment as well. Is there an explanation > for this? Is something wrong with my configuration, or is it a possible bug? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11105) cassandra-stress tool - InvalidQueryException: Batch too large
[ https://issues.apache.org/jira/browse/CASSANDRA-11105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128631#comment-15128631 ] Eric Evans commented on CASSANDRA-11105: FWIW, I'm seeing the same thing (2.1.12), [yaml gist here|https://gist.github.com/eevans/1babf3fab9206951d7e6]. When I run this config with {{n=1}}, I can see that 50 CQL rows are added, all with the same partition key, with two unique {{rev}} columns (25 each). > cassandra-stress tool - InvalidQueryException: Batch too large > -- > > Key: CASSANDRA-11105 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11105 > Project: Cassandra > Issue Type: Bug > Components: Tools > Environment: Cassandra 2.2.4, Java 8, CentOS 6.5 >Reporter: Ralf Steppacher > Attachments: batch_too_large.yaml > > > I am using Cassandra 2.2.4 and I am struggling to get the cassandra-stress > tool to work for my test scenario. I have followed the example on > http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema > to create a yaml file describing my test (attached). > I am collecting events per user id (text, partition key). Events have a > session type (text), event type (text), and creation time (timestamp) > (clustering keys, in that order). Plus some more attributes required for > rendering the events in a UI. 
For testing purposes I ended up with the > following column spec and insert distribution: > {noformat} > columnspec: > - name: created_at > cluster: uniform(10..1) > - name: event_type > size: uniform(5..10) > population: uniform(1..30) > cluster: uniform(1..30) > - name: session_type > size: fixed(5) > population: uniform(1..4) > cluster: uniform(1..4) > - name: user_id > size: fixed(15) > population: uniform(1..100) > - name: message > size: uniform(10..100) > population: uniform(1..100B) > insert: > partitions: fixed(1) > batchtype: UNLOGGED > select: fixed(1)/120 > {noformat} > Running stress tool for just the insert prints > {noformat} > Generating batches with [1..1] partitions and [0..1] rows (of [10..120] > total rows in the partitions) > {noformat} > and then immediately starts flooding me with > {{com.datastax.driver.core.exceptions.InvalidQueryException: Batch too > large}}. > Why I should be exceeding the {{batch_size_fail_threshold_in_kb: 50}} in the > {{cassandra.yaml}} I do not understand. My understanding is that the stress > tool should generate one row per batch. The size of a single row should not > exceed {{8+10*3+5*3+15*3+100*3 = 398 bytes}}. Assuming a worst case of all > text characters being 3 byte unicode characters. > This is how I start the attached user scenario: > {noformat} > [rsteppac@centos bin]$ ./cassandra-stress user > profile=../batch_too_large.yaml ops\(insert=1\) -log level=verbose > file=~/centos_event_by_patient_session_event_timestamp_insert_only.log -node > 10.211.55.8 > INFO 08:00:07 Did not find Netty's native epoll transport in the classpath, > defaulting to NIO. > INFO 08:00:08 Using data-center name 'datacenter1' for > DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct > datacenter name with DCAwareRoundRobinPolicy constructor) > INFO 08:00:08 New Cassandra host /10.211.55.8:9042 added > Connected to cluster: Titan_DEV > Datatacenter: datacenter1; Host: /10.211.55.8; Rack: rack1 > Created schema. 
Sleeping 1s for propagation. > Generating batches with [1..1] partitions and [0..1] rows (of [10..120] > total rows in the partitions) > com.datastax.driver.core.exceptions.InvalidQueryException: Batch too large > at > com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35) > at > com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:271) > at > com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:185) > at > com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:55) > at > org.apache.cassandra.stress.operations.userdefined.SchemaInsert$JavaDriverRun.run(SchemaInsert.java:87) > at > org.apache.cassandra.stress.Operation.timeWithRetry(Operation.java:159) > at > org.apache.cassandra.stress.operations.userdefined.SchemaInsert.run(SchemaInsert.java:119) > at > org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:309) > Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Batch > too large > at > com.datastax.driver.core.Responses$Error.asException(Responses.java:125) > at > com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:120) >
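The reporter's worst-case row-size estimate above can be checked directly. A small sketch, with field sizes taken from the quoted columnspec and the stated worst-case assumption of 3-byte Unicode characters:

```python
BYTES_PER_CHAR = 3  # stated worst case: every text character is 3-byte Unicode

# Maximum serialized size per field, per the columnspec quoted above
fields = {
    "created_at": 8,                     # timestamp: 8 bytes
    "event_type": 10 * BYTES_PER_CHAR,   # size: uniform(5..10)
    "session_type": 5 * BYTES_PER_CHAR,  # size: fixed(5)
    "user_id": 15 * BYTES_PER_CHAR,      # size: fixed(15)
    "message": 100 * BYTES_PER_CHAR,     # size: uniform(10..100)
}
print(sum(fields.values()))  # -> 398
```

That the per-row total is far below the 50 KB {{batch_size_fail_threshold_in_kb}} is exactly what makes the "Batch too large" error surprising, as the report argues.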
[jira] [Commented] (CASSANDRA-8821) Errors in JVM_OPTS and cassandra_parms environment vars
[ https://issues.apache.org/jira/browse/CASSANDRA-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908248#comment-14908248 ] Eric Evans commented on CASSANDRA-8821: --- [~mshuler]: {{debian/init}}: This is all a throwback to when the init script used {{jsvc}}, and there is a bit more of it that can be excised; you can also remove the bits that source {{cassandra-env.sh}}, and the zero-length test of {{$JVM_OPTS}}. See: CASSANDRA-10251. {{conf/cassandra-env.\{sh,ps1\}}}: I don't think we should be promoting the use of {{JVM_EXTRA_OPTS}} from within {{cassandra-env.*}}. By which I mean, if you are editing one of those files then it's probably sanest to just edit {{JVM_OPTS}}, and leave {{JVM_EXTRA_OPTS}} as a way of injecting options from elsewhere. I suggest that we remove the commented-out assignment and adjust the documenting comments accordingly, or remove them entirely. You can put a commented-out {{JVM_EXTRA_OPTS}} assignment, and corresponding documentation, in {{debian/default}}. Does this sound reasonable? > Errors in JVM_OPTS and cassandra_parms environment vars > --- > > Key: CASSANDRA-8821 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8821 > Project: Cassandra > Issue Type: Bug > Environment: Ubuntu 14.04 LTS amd64 >Reporter: Terry Moschou >Assignee: Michael Shuler >Priority: Minor > Fix For: 2.1.x, 2.2.x > > Attachments: 8821_2.0.txt, 8821_2.1.txt > > > Repos: > deb http://www.apache.org/dist/cassandra/debian 21x main > deb-src http://www.apache.org/dist/cassandra/debian 21x main > The cassandra init script > /etc/init.d/cassandra > is sourcing the environment file > /etc/cassandra/cassandra-env.sh > twice. Once directly from the init script, and again inside > /usr/sbin/cassandra > The result is arguments in JVM_OPTS are duplicated. > Further the JVM opt > -XX:CMSWaitDuration=1 > is defined twice if jvm >= 1.7.60.
> Also, for the environment variable CASSANDRA_CONF used in this context > -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler" > is undefined when > /etc/cassandra/cassandra-env.sh > is sourced from the init script. > Lastly the variable cassandra_storagedir is undefined in > /usr/sbin/cassandra > when used in this context > -Dcassandra.storagedir=$cassandra_storagedir -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8821) Errors in JVM_OPTS and cassandra_parms environment vars
[ https://issues.apache.org/jira/browse/CASSANDRA-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-8821: -- Reviewer: Eric Evans (was: Brandon Williams) > Errors in JVM_OPTS and cassandra_parms environment vars > --- > > Key: CASSANDRA-8821 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8821 > Project: Cassandra > Issue Type: Bug > Environment: Ubuntu 14.04 LTS amd64 >Reporter: Terry Moschou >Assignee: Michael Shuler >Priority: Minor > Fix For: 2.1.x, 2.2.x > > Attachments: 8821_2.0.txt, 8821_2.1.txt > > > Repos: > deb http://www.apache.org/dist/cassandra/debian 21x main > deb-src http://www.apache.org/dist/cassandra/debian 21x main > The cassandra init script > /etc/init.d/cassandra > is sourcing the environment file > /etc/cassandra/cassandra-env.sh > twice. Once directly from the init script, and again inside > /usr/sbin/cassandra > The result is arguments in JVM_OPTS are duplicated. > Further the JVM opt > -XX:CMSWaitDuration=1 > is defined twice if jvm >= 1.7.60. > Also, for the environment variable CASSANDRA_CONF used in this context > -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler" > is undefined when > /etc/cassandra/cassandra-env.sh > is sourced from the init script. > Lastly the variable cassandra_storagedir is undefined in > /usr/sbin/cassandra > when used in this context > -Dcassandra.storagedir=$cassandra_storagedir -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10251) JVM_OPTS repetition when started from init script
[ https://issues.apache.org/jira/browse/CASSANDRA-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-10251: --- Reviewer: Michael Shuler (was: Brandon Williams) > JVM_OPTS repetition when started from init script > - > > Key: CASSANDRA-10251 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10251 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: Debian >Reporter: Eric Evans >Assignee: Eric Evans > Attachments: cassandra.init.patch > > > The Debian package init script sources {{cassandra-env.sh}}, and exports > {{JVM_OPTS}}, a throw back to when we used jsvc, and constructed the full > command with args. Now that we are using {{/usr/sbin/cassandra}}, which > sources {{cassandra-env.sh}} itself, this results in the contents of > {{JVM_OPTS}} appearing twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10251) JVM_OPTS repetition when started from init script
[ https://issues.apache.org/jira/browse/CASSANDRA-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-10251: --- Reviewer: Brandon Williams > JVM_OPTS repetition when started from init script > - > > Key: CASSANDRA-10251 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10251 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: Debian >Reporter: Eric Evans >Assignee: Eric Evans > Attachments: cassandra.init.patch > > > The Debian package init script sources {{cassandra-env.sh}}, and exports > {{JVM_OPTS}}, a throw back to when we used jsvc, and constructed the full > command with args. Now that we are using {{/usr/sbin/cassandra}}, which > sources {{cassandra-env.sh}} itself, this results in the contents of > {{JVM_OPTS}} appearing twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10251) JVM_OPTS repetition when started from init script
Eric Evans created CASSANDRA-10251: -- Summary: JVM_OPTS repetition when started from init script Key: CASSANDRA-10251 URL: https://issues.apache.org/jira/browse/CASSANDRA-10251 Project: Cassandra Issue Type: Bug Components: Packaging Environment: Debian Reporter: Eric Evans Assignee: Eric Evans Attachments: cassandra.init.patch The Debian package init script sources {{cassandra-env.sh}}, and exports {{JVM_OPTS}}, a throwback to when we used jsvc and constructed the full command with args. Now that we are using {{/usr/sbin/cassandra}}, which sources {{cassandra-env.sh}} itself, this results in the contents of {{JVM_OPTS}} appearing twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697008#comment-14697008 ] Eric Evans commented on CASSANDRA-9625: --- What we eventually did was bang together https://github.com/wikimedia/cassandra-metrics-collector, which collects from JMX, and pushes to Graphite using metric names that are compatible with GraphiteReporter (i.e. you can leverage your existing graphs, etc). Hopefully this is useful to anyone else encountering this problem. > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not > on a 3 node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9624) unable to bootstrap; streaming fails with NullPointerException
[ https://issues.apache.org/jira/browse/CASSANDRA-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646779#comment-14646779 ] Eric Evans commented on CASSANDRA-9624: --- I'm sorry Yuki, I didn't see this comment when you made it (not sure why). > unable to bootstrap; streaming fails with NullPointerException > -- > > Key: CASSANDRA-9624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9624 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: Yuki Morishita > Fix For: 2.1.x > > > When attempting to bootstrap a new node into a 2.1.3 cluster, the stream > source fails with a {{NullPointerException}}: > {noformat} > ERROR [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,264 StreamSession.java:477 > - [Stream #60e8c120- > 115f-11e5-9fee-] Streaming error occurred > java.lang.NullPointerException: null > at > org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:313) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:493) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:425) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] > INFO [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,265 > StreamResultFuture.java:180 - [Stream #60e8c120-115f-11e5-9fee-] > Session with /10.xx.x.xx1 is complete > {noformat} > _Update 
(2015-06-26):_ > I can also reproduce this on 2.1.7, though without the NPE on the stream-from > side. > Stream source / existing node: > {noformat} > INFO [STREAM-IN-/10.64.32.178] 2015-06-26 06:48:53,060 > StreamResultFuture.java:180 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] > Session with /10.64.32.178 is complete > INFO [STREAM-IN-/10.64.32.178] 2015-06-26 06:48:53,064 > StreamResultFuture.java:212 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] > All sessions completed > {noformat} > Stream sink / bootstrapping node: > {noformat} > INFO [StreamReceiveTask:57] 2015-06-26 06:48:53,061 > StreamResultFuture.java:180 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] > Session with /10.64.32.160 is complete > WARN [StreamReceiveTask:57] 2015-06-26 06:48:53,062 > StreamResultFuture.java:207 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] > Stream failed > INFO [CompactionExecutor:2885] 2015-06-26 06:48:53,062 > ColumnFamilyStore.java:906 - Enqueuing flush of compactions_in_progress: 428 > (0%) on-heap, 379 (0%) off-heap > INFO [MemtableFlushWriter:959] 2015-06-26 06:48:53,063 Memtable.java:346 - > Writing Memtable-compactions_in_progress@1203013482(294 serialized bytes, 12 > ops, 0%/0% of on/off-heap limit) > ERROR [main] 2015-06-26 06:48:53,063 CassandraDaemon.java:541 - Exception > encountered during startup > java.lang.RuntimeException: Error during boostrap: Stream failed > at > org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:86) > ~[apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1137) > ~[apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:927) > ~[apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:723) > ~[apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:605) > 
~[apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378) > [apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:524) > [apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:613) > [apache-cassandra-2.1.7.jar:2.1.7] > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > ~[apache-cass
[jira] [Commented] (CASSANDRA-9624) unable to bootstrap; streaming fails with NullPointerException
[ https://issues.apache.org/jira/browse/CASSANDRA-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646776#comment-14646776 ] Eric Evans commented on CASSANDRA-9624: --- Since we had previously bootstrapped with 2.1.3 when the cluster had less data and lower throughput, we scaled the cluster down to see if it would help, and in so doing, just managed to successfully bootstrap a new node. I don't even. > unable to bootstrap; streaming fails with NullPointerException > -- > > Key: CASSANDRA-9624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9624 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: Yuki Morishita > Fix For: 2.1.x > > > When attempting to bootstrap a new node into a 2.1.3 cluster, the stream > source fails with a {{NullPointerException}}: > {noformat} > ERROR [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,264 StreamSession.java:477 > - [Stream #60e8c120- > 115f-11e5-9fee-] Streaming error occurred > java.lang.NullPointerException: null > at > org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:313) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:493) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:425) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] > INFO 
[STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,265 > StreamResultFuture.java:180 - [Stream #60e8c120-115f-11e5-9fee-] > Session with /10.xx.x.xx1 is complete > {noformat} > _Update (2015-06-26):_ > I can also reproduce this on 2.1.7, though without the NPE on the stream-from > side. > Stream source / existing node: > {noformat} > INFO [STREAM-IN-/10.64.32.178] 2015-06-26 06:48:53,060 > StreamResultFuture.java:180 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] > Session with /10.64.32.178 is complete > INFO [STREAM-IN-/10.64.32.178] 2015-06-26 06:48:53,064 > StreamResultFuture.java:212 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] > All sessions completed > {noformat} > Stream sink / bootstrapping node: > {noformat} > INFO [StreamReceiveTask:57] 2015-06-26 06:48:53,061 > StreamResultFuture.java:180 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] > Session with /10.64.32.160 is complete > WARN [StreamReceiveTask:57] 2015-06-26 06:48:53,062 > StreamResultFuture.java:207 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] > Stream failed > INFO [CompactionExecutor:2885] 2015-06-26 06:48:53,062 > ColumnFamilyStore.java:906 - Enqueuing flush of compactions_in_progress: 428 > (0%) on-heap, 379 (0%) off-heap > INFO [MemtableFlushWriter:959] 2015-06-26 06:48:53,063 Memtable.java:346 - > Writing Memtable-compactions_in_progress@1203013482(294 serialized bytes, 12 > ops, 0%/0% of on/off-heap limit) > ERROR [main] 2015-06-26 06:48:53,063 CassandraDaemon.java:541 - Exception > encountered during startup > java.lang.RuntimeException: Error during boostrap: Stream failed > at > org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:86) > ~[apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1137) > ~[apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:927) > ~[apache-cassandra-2.1.7.jar:2.1.7] > at > 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:723) > ~[apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:605) > ~[apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378) > [apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:524) > [apache-cassandra-2.1.7.jar:2.1.7] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:613) > [apache-cassandra-2.1.7.jar:2.1.7] > Caused by: org.apache.cassandra.streaming.StreamE
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608295#comment-14608295 ] Eric Evans commented on CASSANDRA-9625: --- Per a discussion on IRC, I set the interval to 120 seconds (double). Unfortunately, reporting still ceased. > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Fix For: 2.1.x > > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not > on a 3 node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606384#comment-14606384 ] Eric Evans edited comment on CASSANDRA-9625 at 6/29/15 8:58 PM: I'm not surprised you're unable to replicate, I can't replicate this in my staging environment (same software, fewer machines, less data, less traffic, etc). {quote} ...do you have any logs or can you tell me how many column families you have? {quote} The logs are completely normal, only the usual startup messages, nothing else. I locally patched the reporter on one machine and added copious logging statements, the last one to execute was at the beginning of this method: https://github.com/dropwizard/metrics/blob/v2.2.0/metrics-graphite/src/main/java/com/yammer/metrics/reporting/GraphiteReporter.java#L240 (it never makes it to the statement at the end). I have about 150 column families. {quote} Also are you verifying that graphite didn't just poop out? {quote} Yeah, we use Graphite quite a bit, there'd be no missing an outage there. was (Author: urandom): I'm not surprised you're unable to replicate, I can't replicate this in my staging environment (same software, fewer machines, less data, less traffic, etc). {quote} ...do you have any logs or can you tell me how many column families you have? {quote} The logs are completely normal, only the usual startup messages, nothing else. I locally patched the reporter on one machine and added a copious logging statements, the last one to execute was at the beginning of this method: https://github.com/dropwizard/metrics/blob/v2.2.0/metrics-graphite/src/main/java/com/yammer/metrics/reporting/GraphiteReporter.java#L240 (it never makes it to the statement at the end). I have about 150 column families. {quote} Also are you verifying that graphite didn't just poop out? {quote} Yeah, we use Graphite quite a bit, there'd be no missing an outage there. 
> GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Fix For: 2.1.x > > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not > on a 3 node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606384#comment-14606384 ] Eric Evans commented on CASSANDRA-9625: --- I'm not surprised you're unable to replicate, I can't replicate this in my staging environment (same software, fewer machines, less data, less traffic, etc). {quote} ...do you have any logs or can you tell me how many column families you have? {quote} The logs are completely normal, only the usual startup messages, nothing else. I locally patched the reporter on one machine and added copious logging statements; the last one to execute was at the beginning of this method: https://github.com/dropwizard/metrics/blob/v2.2.0/metrics-graphite/src/main/java/com/yammer/metrics/reporting/GraphiteReporter.java#L240 (it never makes it to the statement at the end). I have about 150 column families. {quote} Also are you verifying that graphite didn't just poop out? {quote} Yeah, we use Graphite quite a bit, there'd be no missing an outage there. > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Fix For: 2.1.x > > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not > on a 3 node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9624) unable to bootstrap; streaming fails with NullPointerException
[ https://issues.apache.org/jira/browse/CASSANDRA-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-9624: -- Description: When attempting to bootstrap a new node into a 2.1.3 cluster, the stream source fails with a {{NullPointerException}}: {noformat} ERROR [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,264 StreamSession.java:477 - [Stream #60e8c120- 115f-11e5-9fee-] Streaming error occurred java.lang.NullPointerException: null at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:313) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:493) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:425) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] INFO [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,265 StreamResultFuture.java:180 - [Stream #60e8c120-115f-11e5-9fee-] Session with /10.xx.x.xx1 is complete {noformat} _Update (2015-06-26):_ I can also reproduce this on 2.1.7, though without the NPE on the stream-from side. 
Stream source / existing node: {noformat} INFO [STREAM-IN-/10.64.32.178] 2015-06-26 06:48:53,060 StreamResultFuture.java:180 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] Session with /10.64.32.178 is complete INFO [STREAM-IN-/10.64.32.178] 2015-06-26 06:48:53,064 StreamResultFuture.java:212 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] All sessions completed {noformat} Stream sink / bootstrapping node: {noformat} INFO [StreamReceiveTask:57] 2015-06-26 06:48:53,061 StreamResultFuture.java:180 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] Session with /10.64.32.160 is complete WARN [StreamReceiveTask:57] 2015-06-26 06:48:53,062 StreamResultFuture.java:207 - [Stream #8bdeb1b0-1ad2-11e5-abd8-3fcfb96209d9] Stream failed INFO [CompactionExecutor:2885] 2015-06-26 06:48:53,062 ColumnFamilyStore.java:906 - Enqueuing flush of compactions_in_progress: 428 (0%) on-heap, 379 (0%) off-heap INFO [MemtableFlushWriter:959] 2015-06-26 06:48:53,063 Memtable.java:346 - Writing Memtable-compactions_in_progress@1203013482(294 serialized bytes, 12 ops, 0%/0% of on/off-heap limit) ERROR [main] 2015-06-26 06:48:53,063 CassandraDaemon.java:541 - Exception encountered during startup java.lang.RuntimeException: Error during boostrap: Stream failed at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:86) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1137) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:927) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:723) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:605) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378) [apache-cassandra-2.1.7.jar:2.1.7] at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:524) [apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:613) [apache-cassandra-2.1.7.jar:2.1.7] Caused by: org.apache.cassandra.streaming.StreamException: Stream failed at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) ~[apache-cassandra-2.1.7.jar:2.1.7] at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na] at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) ~[guava-16.0.jar:na] at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) ~[guava-16.0.jar:na] at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) ~[guava-16.0.jar:na] at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) ~[guava-16.0.jar:na] at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208) ~[apache-cassandra-2.1.7.jar:2.1.7] a
[jira] [Updated] (CASSANDRA-9624) unable to bootstrap; streaming fails with NullPointerException
[ https://issues.apache.org/jira/browse/CASSANDRA-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-9624: -- Reproduced In: 2.1.7, 2.1.3 (was: 2.1.3) > unable to bootstrap; streaming fails with NullPointerException > -- > > Key: CASSANDRA-9624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9624 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: Yuki Morishita > Fix For: 2.1.x > > > When attempting to bootstrap a new node into a 2.1.3 cluster, the stream > source fails with a {{NullPointerException}}: > {noformat} > ERROR [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,264 StreamSession.java:477 > - [Stream #60e8c120- > 115f-11e5-9fee-] Streaming error occurred > java.lang.NullPointerException: null > at > org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:313) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:493) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:425) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] > INFO [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,265 > StreamResultFuture.java:180 - [Stream #60e8c120-115f-11e5-9fee-] > Session with /10.xx.x.xx1 is complete > {noformat} > _Update (2015-06-26):_ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9624) unable to bootstrap; streaming fails with NullPointerException
[ https://issues.apache.org/jira/browse/CASSANDRA-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-9624: -- Description: When attempting to bootstrap a new node into a 2.1.3 cluster, the stream source fails with a {{NullPointerException}}: {noformat} ERROR [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,264 StreamSession.java:477 - [Stream #60e8c120- 115f-11e5-9fee-] Streaming error occurred java.lang.NullPointerException: null at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:313) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:493) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:425) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] INFO [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,265 StreamResultFuture.java:180 - [Stream #60e8c120-115f-11e5-9fee-] Session with /10.xx.x.xx1 is complete {noformat} _Update (2015-06-26):_ was: When attempting to bootstrap a new node into a 2.1.3 cluster, the stream source fails with a {{NullPointerException}}: {noformat} ERROR [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,264 StreamSession.java:477 - [Stream #60e8c120- 115f-11e5-9fee-] Streaming error occurred java.lang.NullPointerException: null at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) ~[apache-cassandra-2.1.3.jar:2.1.3] at 
org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:313) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:493) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:425) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] INFO [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,265 StreamResultFuture.java:180 - [Stream #60e8c120-115f-11e5-9fee-] Session with /10.xx.x.xx1 is complete {noformat} > unable to bootstrap; streaming fails with NullPointerException > -- > > Key: CASSANDRA-9624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9624 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: Yuki Morishita > Fix For: 2.1.x > > > When attempting to bootstrap a new node into a 2.1.3 cluster, the stream > source fails with a {{NullPointerException}}: > {noformat} > ERROR [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,264 StreamSession.java:477 > - [Stream #60e8c120- > 115f-11e5-9fee-] Streaming error occurred > java.lang.NullPointerException: null > at > org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:313) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > 
org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:493) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:425) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] > INFO [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,265 > StreamResultFuture.java:180 - [Stream #60e8c120-115f-11e5-9fee-] > Session with /10.xx.x.xx1 is co
[jira] [Updated] (CASSANDRA-9624) unable to bootstrap; streaming fails with NullPointerException
[ https://issues.apache.org/jira/browse/CASSANDRA-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-9624: -- Reproduced In: 2.1.7, 2.1.6, 2.1.3 (was: 2.1.3, 2.1.7) > unable to bootstrap; streaming fails with NullPointerException > -- > > Key: CASSANDRA-9624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9624 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: Yuki Morishita > Fix For: 2.1.x > > > When attempting to bootstrap a new node into a 2.1.3 cluster, the stream > source fails with a {{NullPointerException}}: > {noformat} > ERROR [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,264 StreamSession.java:477 > - [Stream #60e8c120- > 115f-11e5-9fee-] Streaming error occurred > java.lang.NullPointerException: null > at > org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:313) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:493) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:425) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251) > ~[apache-cassandra-2.1.3.jar:2.1.3] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] > INFO [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,265 > StreamResultFuture.java:180 - [Stream #60e8c120-115f-11e5-9fee-] > Session with /10.xx.x.xx1 is complete > {noformat} > _Update (2015-06-26):_ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-9625: -- Attachment: metrics.yaml > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not > on a 3-node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml.
[jira] [Updated] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-9625: -- Attachment: thread-dump.log > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans > Attachments: thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not > on a 3-node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml.
[jira] [Created] (CASSANDRA-9625) GraphiteReporter not reporting
Eric Evans created CASSANDRA-9625: - Summary: GraphiteReporter not reporting Key: CASSANDRA-9625 URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 Project: Cassandra Issue Type: Bug Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 Reporter: Eric Evans When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops working. The usual startup is logged, and one batch of samples is sent, but the reporting interval comes and goes, and no other samples are ever sent. The logs are free from errors. Frustratingly, metrics reporting works in our smaller (staging) environment on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not on a 3-node (otherwise identical) staging cluster (maybe it takes a certain level of concurrency?). Attached is a thread dump, and our metrics.yaml.
[jira] [Created] (CASSANDRA-9624) unable to bootstrap; streaming fails with NullPointerException
Eric Evans created CASSANDRA-9624: - Summary: unable to bootstrap; streaming fails with NullPointerException Key: CASSANDRA-9624 URL: https://issues.apache.org/jira/browse/CASSANDRA-9624 Project: Cassandra Issue Type: Bug Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 Reporter: Eric Evans When attempting to bootstrap a new node into a 2.1.3 cluster, the stream source fails with a {{NullPointerException}}: {noformat} ERROR [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,264 StreamSession.java:477 - [Stream #60e8c120- 115f-11e5-9fee-] Streaming error occurred java.lang.NullPointerException: null at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:313) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:493) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:425) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] INFO [STREAM-IN-/10.xx.x.xxx] 2015-06-13 00:02:01,265 StreamResultFuture.java:180 - [Stream #60e8c120-115f-11e5-9fee-] Session with /10.xx.x.xx1 is complete {noformat}
[jira] [Created] (CASSANDRA-9594) metrics reporter doesn't start until after a bootstrap
Eric Evans created CASSANDRA-9594: - Summary: metrics reporter doesn't start until after a bootstrap Key: CASSANDRA-9594 URL: https://issues.apache.org/jira/browse/CASSANDRA-9594 Project: Cassandra Issue Type: Bug Components: Core Reporter: Eric Evans Priority: Minor In {{o.a.c.service.CassandraDaemon#setup}}, the metrics reporter is started immediately after the invocation of {{o.a.c.service.StorageService#initServer}}, which for a bootstrapping node may block for a considerable period of time. If the metrics reporter is your only source of visibility, then you are blind until the bootstrap completes.
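The ordering problem in CASSANDRA-9594 can be reduced to a toy sketch. The class and method names below are hypothetical stand-ins, not Cassandra's actual {{CassandraDaemon}} code; the point is only that starting the reporter before the potentially long-blocking ring join keeps a bootstrapping node observable:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the CassandraDaemon#setup ordering (hypothetical names):
// the proposed fix is to start metrics *before* the ring join, which can
// block for hours while a new node bootstraps.
public class StartupOrder {
    public static final List<String> events = new ArrayList<>();

    // Stand-in for StorageService#initServer; returns only once bootstrap completes.
    static void initServer() { events.add("bootstrap"); }

    // Stand-in for starting the metrics reporter (e.g. the GraphiteReporter).
    static void startMetricsReporter() { events.add("reporter"); }

    public static void main(String[] args) {
        startMetricsReporter(); // metrics first, so the long join is observable
        initServer();
        System.out.println(events); // [reporter, bootstrap]
    }
}
```

With the original ordering the two calls would simply be swapped, leaving the node blind until `initServer` returns.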
[jira] [Created] (CASSANDRA-9057) index validation fails for non-indexed column
Eric Evans created CASSANDRA-9057: - Summary: index validation fails for non-indexed column Key: CASSANDRA-9057 URL: https://issues.apache.org/jira/browse/CASSANDRA-9057 Project: Cassandra Issue Type: Bug Reporter: Eric Evans On 2.1.3, updates are failing with an InvalidRequestException when an unindexed column value exceeds the maximum size allowed for indexed entries. {noformat} ResponseError: Can't index column value of size 1483409 for index null on local_group_default_T_parsoid_html.data {noformat} In this case, the update _does_ include a 1483409 byte column value, but it is for a column that is not indexed (the single indexed column is < 32 bytes); presumably this is why {{cfm.getColumnDefinition(cell.name()).getIndexName()}} returns {{null}}.
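The guard that CASSANDRA-9057 implies is missing can be sketched as follows. This is a hypothetical reconstruction with assumed names and an assumed 64 KiB cap, not Cassandra's actual validation code: the size limit should only be enforced when the column actually carries an index, i.e. when the index name is non-null.

```java
// Hypothetical reconstruction of the validation described in the report
// (assumed constant and method names; not Cassandra's real code).
public class IndexValidation {
    // Assumed cap: indexed values were limited to an unsigned short length (64 KiB).
    static final int MAX_INDEXED_VALUE_SIZE = 65535;

    // indexName == null corresponds to getIndexName() returning null in the
    // report, i.e. "this column is not indexed".
    static boolean isValid(String indexName, int valueSize) {
        if (indexName == null)
            return true; // not indexed: the indexed-value size limit does not apply
        return valueSize <= MAX_INDEXED_VALUE_SIZE;
    }

    public static void main(String[] args) {
        // The 1483409-byte value from the report, written to an unindexed column:
        System.out.println(isValid(null, 1483409));  // true: should be accepted
        System.out.println(isValid("idx", 1483409)); // false: genuinely too large
    }
}
```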
[jira] [Commented] (CASSANDRA-8784) -bash: bin/cassandra: No such file or directory
[ https://issues.apache.org/jira/browse/CASSANDRA-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317254#comment-14317254 ] Eric Evans commented on CASSANDRA-8784: --- {quote} -bash: bin/cassandra: No such file or directory {quote} This is {{bash}} indicating that {{bin/cassandra}} is neither a file nor a directory. Hope this helps. > -bash: bin/cassandra: No such file or directory > --- > > Key: CASSANDRA-8784 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8784 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: ubuntu 14.04 >Reporter: R Scott > Fix For: 2.1.2 > >
[jira] [Commented] (CASSANDRA-6602) Compaction improvements to optimize time series data
[ https://issues.apache.org/jira/browse/CASSANDRA-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1405#comment-1405 ] Eric Evans commented on CASSANDRA-6602: --- Has there been any further discussion about where this should land (2.0, 2.1, 3.0)? 3.0.0 seems too conservative for a compaction strategy implementation. [~Bj0rn]: do you have any results from your production testing you can share? > Compaction improvements to optimize time series data > > > Key: CASSANDRA-6602 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6602 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Tupshin Harper >Assignee: Björn Hegerfors > Labels: compaction, performance > Fix For: 3.0 > > Attachments: > cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy.txt, > cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy_v2.txt, > cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy_v3.txt > > > There are some unique characteristics of many/most time series use cases that > both provide challenges, as well as provide unique opportunities for > optimizations. > One of the major challenges is in compaction. The existing compaction > strategies will tend to re-compact data on disk at least a few times over the > lifespan of each data point, greatly increasing the cpu and IO costs of that > write. > Compaction exists to > 1) ensure that there aren't too many files on disk > 2) ensure that data that should be contiguous (part of the same partition) is > laid out contiguously > 3) deleting data due to ttls or tombstones > The special characteristics of time series data allow us to optimize away all > three. 
> Time series data > 1) tends to be delivered in time order, with relatively constrained exceptions > 2) often has a pre-determined and fixed expiration date > 3) Never gets deleted prior to TTL > 4) Has relatively predictable ingestion rates > Note that I filed CASSANDRA-5561 and this ticket potentially replaces or > lowers the need for it. In that ticket, jbellis reasonably asks, how that > compaction strategy is better than disabling compaction. > Taking that to heart, here is a compaction-strategy-less approach that could > be extremely efficient for time-series use cases that follow the above > pattern. > (For context, I'm thinking of an example use case involving lots of streams > of time-series data with a 5GB per day ingestion rate, and a 1000 day > retention with TTL, resulting in an eventual steady state of 5TB per node) > 1) You have an extremely large memtable (preferably off heap, if/when doable) > for the table, and that memtable is sized to be able to hold a lengthy window > of time. A typical period might be one day. At the end of that period, you > flush the contents of the memtable to an sstable and move to the next one. > This is basically identical to current behaviour, but with thresholds > adjusted so that you can ensure flushing at predictable intervals. (Open > question is whether predictable intervals is actually necessary, or whether > just waiting until the huge memtable is nearly full is sufficient) > 2) Combine the behaviour with CASSANDRA-5228 so that sstables will be > efficiently dropped once all of the columns have. (Another side note, it > might be valuable to have a modified version of CASSANDRA-3974 that doesn't > bother storing per-column TTL since it is required that all columns have the > same TTL) > 3) Be able to mark column families as read/write only (no explicit deletes), > so no tombstones. 
> 4) Optionally add back an additional type of delete that would delete all > data earlier than a particular timestamp, resulting in immediate dropping of > obsoleted sstables. > The result is that for in-order delivered data, Every cell will be laid out > optimally on disk on the first pass, and over the course of 1000 days and 5TB > of data, there will "only" be 1000 5GB sstables, so the number of filehandles > will be reasonable. > For exceptions (out-of-order delivery), most cases will be caught by the > extended (24 hour+) memtable flush times and merged correctly automatically. > For those that were slightly askew at flush time, or were delivered so far > out of order that they go in the wrong sstable, there is relatively low > overhead to reading from two sstables for a time slice, instead of one, and > that overhead would be incurred relatively rarely unless out-of-order > delivery was the common case, in which case, this strategy should not be used. > Another possible optimization to address out-of-order would be to maintain > more than one time-centric memtables in memory at a time (e.g. two 12 hour > ones), and then you always insert into whi
[jira] [Commented] (CASSANDRA-6602) Compaction improvements to optimize time series data
[ https://issues.apache.org/jira/browse/CASSANDRA-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049237#comment-14049237 ] Eric Evans commented on CASSANDRA-6602: --- I like this; Let me know if I can be of help (review, testing, etc). > Compaction improvements to optimize time series data > > > Key: CASSANDRA-6602 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6602 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Tupshin Harper >Assignee: Björn Hegerfors > Labels: compaction, performance > Fix For: 3.0 > > Attachments: > cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy.txt, > cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy_v2.txt, > cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy_v3.txt > > > There are some unique characteristics of many/most time series use cases that > both provide challenges, as well as provide unique opportunities for > optimizations. > One of the major challenges is in compaction. The existing compaction > strategies will tend to re-compact data on disk at least a few times over the > lifespan of each data point, greatly increasing the cpu and IO costs of that > write. > Compaction exists to > 1) ensure that there aren't too many files on disk > 2) ensure that data that should be contiguous (part of the same partition) is > laid out contiguously > 3) deleting data due to ttls or tombstones > The special characteristics of time series data allow us to optimize away all > three. > Time series data > 1) tends to be delivered in time order, with relatively constrained exceptions > 2) often has a pre-determined and fixed expiration date > 3) Never gets deleted prior to TTL > 4) Has relatively predictable ingestion rates > Note that I filed CASSANDRA-5561 and this ticket potentially replaces or > lowers the need for it. In that ticket, jbellis reasonably asks, how that > compaction strategy is better than disabling compaction. 
> Taking that to heart, here is a compaction-strategy-less approach that could > be extremely efficient for time-series use cases that follow the above > pattern. > (For context, I'm thinking of an example use case involving lots of streams > of time-series data with a 5GB per day ingestion rate, and a 1000 day > retention with TTL, resulting in an eventual steady state of 5TB per node) > 1) You have an extremely large memtable (preferably off heap, if/when doable) > for the table, and that memtable is sized to be able to hold a lengthy window > of time. A typical period might be one day. At the end of that period, you > flush the contents of the memtable to an sstable and move to the next one. > This is basically identical to current behaviour, but with thresholds > adjusted so that you can ensure flushing at predictable intervals. (Open > question is whether predictable intervals is actually necessary, or whether > just waiting until the huge memtable is nearly full is sufficient) > 2) Combine the behaviour with CASSANDRA-5228 so that sstables will be > efficiently dropped once all of the columns have. (Another side note, it > might be valuable to have a modified version of CASSANDRA-3974 that doesn't > bother storing per-column TTL since it is required that all columns have the > same TTL) > 3) Be able to mark column families as read/write only (no explicit deletes), > so no tombstones. > 4) Optionally add back an additional type of delete that would delete all > data earlier than a particular timestamp, resulting in immediate dropping of > obsoleted sstables. > The result is that for in-order delivered data, Every cell will be laid out > optimally on disk on the first pass, and over the course of 1000 days and 5TB > of data, there will "only" be 1000 5GB sstables, so the number of filehandles > will be reasonable. 
> For exceptions (out-of-order delivery), most cases will be caught by the > extended (24 hour+) memtable flush times and merged correctly automatically. > For those that were slightly askew at flush time, or were delivered so far > out of order that they go in the wrong sstable, there is relatively low > overhead to reading from two sstables for a time slice, instead of one, and > that overhead would be incurred relatively rarely unless out-of-order > delivery was the common case, in which case, this strategy should not be used. > Another possible optimization to address out-of-order would be to maintain > more than one time-centric memtables in memory at a time (e.g. two 12 hour > ones), and then you always insert into whichever one of the two "owns" the > appropriate range of time. By delaying flushing the ahead one until we are > ready to roll writes over to a third one, we are able
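The time-windowed memtable idea discussed in CASSANDRA-6602 above can be sketched as a toy bucketing illustration. This assumes one-day windows and hypothetical names; it is not the DateTieredCompactionStrategy patch itself:

```java
import java.util.Map;
import java.util.TreeMap;

// Toy sketch of time-windowed buffering (assumed one-day windows): each
// write is routed to the bucket that "owns" its time window, so in-order
// data would flush into one sstable per window, and slightly out-of-order
// data still lands in the correct window's bucket.
public class TimeWindowedBuckets {
    public static final long WINDOW_MS = 24L * 60 * 60 * 1000; // one day

    // window start timestamp (ms) -> number of cells buffered for that window
    public final Map<Long, Integer> memtables = new TreeMap<>();

    public void write(long timestampMs) {
        long window = timestampMs - (timestampMs % WINDOW_MS);
        memtables.merge(window, 1, Integer::sum);
    }

    public static void main(String[] args) {
        TimeWindowedBuckets b = new TimeWindowedBuckets();
        b.write(100);           // lands in window starting at 0
        b.write(WINDOW_MS + 5); // lands in the next day's window
        b.write(200);           // out of order, but still window 0
        System.out.println(b.memtables); // {0=2, 86400000=1}
    }
}
```

In the scheme proposed above, flushing a window's bucket once a newer window opens is what yields roughly one sstable per day and avoids re-compacting old data.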
[jira] [Updated] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-6311: -- Fix Version/s: (was: 2.0.4) 2.0.5 > Add CqlRecordReader to take advantage of native CQL pagination > -- > > Key: CASSANDRA-6311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6311 > Project: Cassandra > Issue Type: New Feature > Components: Hadoop >Reporter: Alex Liu >Assignee: Alex Liu > Fix For: 2.0.5 > > Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, > 6311-v5-2.0-branch.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt > > > Since the latest CQL pagination is done and should be more efficient, we need > to update CqlPagingRecordReader to use it instead of the custom Thrift paging. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-6373) describe_ring hangs with hsha thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-6373: -- Fix Version/s: (was: 2.0.4) 2.0.5 > describe_ring hangs with hsha thrift server > --- > > Key: CASSANDRA-6373 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6373 > Project: Cassandra > Issue Type: Bug >Reporter: Nick Bailey >Assignee: Pavel Yaskevich > Fix For: 2.0.5 > > Attachments: describe_ring_failure.patch, jstack.txt, jstack2.txt > > > There is a strange bug with the thrift hsha server in 2.0 (we switched to > lmax disruptor server). > The bug is that the first call to describe_ring from one connection will hang > indefinitely when the client is not connecting from localhost (or it at least > looks like the client is not on the same host). Additionally the cluster must > be using vnodes. When connecting from localhost the first call will work as > expected. And in either case subsequent calls from the same connection will > work as expected. According to git bisect the bad commit is the switch to the > lmax disruptor server: > https://github.com/apache/cassandra/commit/98eec0a223251ecd8fec7ecc9e46b05497d631c6 > I've attached the patch I used to reproduce the error in the unit tests. The > command to reproduce is: > {noformat} > PYTHONPATH=test nosetests > --tests=system.test_thrift_server:TestMutations.test_describe_ring > {noformat} > I reproduced on ec2 and a single machine by having the server bind to the > private ip on ec2 and the client connect to the public ip (so it appears as > if the client is non local). I've also reproduced with two different vms > though. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-6210) Repair hangs when a new datacenter is added to a cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-6210: -- Fix Version/s: (was: 2.0.4) 2.0.5 > Repair hangs when a new datacenter is added to a cluster > > > Key: CASSANDRA-6210 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6210 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Amazon Ec2 > 2 M1.large nodes >Reporter: Russell Alexander Spitzer >Assignee: Yuki Morishita > Fix For: 2.0.5 > > Attachments: RepairLogs.tar.gz > > > Attempting to add a new datacenter to a cluster seems to cause repair > operations to break. I've been reproducing this with 20~ node clusters but > can get it to reliably occur on 2 node setups. > {code} > ##Basic Steps to reproduce > #Node 1 is started using GossipingPropertyFileSnitch as dc1 > #Cassandra-stress is used to insert a minimal amount of data > $CASSANDRA_STRESS -t 100 -R > org.apache.cassandra.locator.NetworkTopologyStrategy --num-keys=1000 > --columns=10 --consistency-level=LOCAL_QUORUM --average-size-values - > -compaction-strategy='LeveledCompactionStrategy' -O dc1:1 > --operation=COUNTER_ADD > #Alter "Keyspace1" > ALTER KEYSPACE "Keyspace1" WITH replication = {'class': > 'NetworkTopologyStrategy', 'dc1': 1 , 'dc2': 1 }; > #Add node 2 using GossipingPropertyFileSnitch as dc2 > run repair on node 1 > run repair on node 2 > {code} > The repair task on node 1 never completes and while there are no exceptions > in the logs of node1, netstat reports the following repair tasks > {code} > Mode: NORMAL > Repair 4e71a250-36b4-11e3-bedc-1d1bb5c9abab > Repair 6c64ded0-36b4-11e3-bedc-1d1bb5c9abab > Read Repair Statistics: > Attempted: 0 > Mismatch (Blocking): 0 > Mismatch (Background): 0 > Pool NameActive Pending Completed > Commandsn/a 0 10239 > Responses n/a 0 3839 > {code} > Checking on node 2 we see the following exceptions > {code} > ERROR [STREAM-IN-/10.171.122.130] 2013-10-16 22:42:58,961 
StreamSession.java > (line 410) [Stream #4e71a250-36b4-11e3-bedc-1d1bb5c9abab] Streaming error > occurred > java.lang.NullPointerException > at > org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:174) > at > org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293) > at java.lang.Thread.run(Thread.java:724) > ... > ERROR [STREAM-IN-/10.171.122.130] 2013-10-16 22:43:49,214 StreamSession.java > (line 410) [Stream #6c64ded0-36b4-11e3-bedc-1d1bb5c9abab] Streaming error > occurred > java.lang.NullPointerException > at > org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:174) > at > org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293) > at java.lang.Thread.run(Thread.java:724) > {code} > Netstats on node 2 reports > {code} > automaton@ip-10-171-15-234:~$ nodetool netstats > Mode: NORMAL > Repair 4e71a250-36b4-11e3-bedc-1d1bb5c9abab > Read Repair Statistics: > Attempted: 0 > Mismatch (Blocking): 0 > Mismatch (Background): 0 > Pool NameActive Pending Completed > Commandsn/a 0 2562 > Responses n/a 0 4284 > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-5633) CQL support for updating multiple rows in a partition using CAS
[ https://issues.apache.org/jira/browse/CASSANDRA-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-5633: -- Fix Version/s: (was: 2.0.4) 2.0.5 > CQL support for updating multiple rows in a partition using CAS > --- > > Key: CASSANDRA-5633 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5633 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 2.0 beta 1 >Reporter: sankalp kohli >Assignee: Sylvain Lebresne >Priority: Minor > Labels: cql3 > Fix For: 2.0.5 > > > This is currently supported via Thrift but not via CQL. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-6086) Node refuses to start with exception in ColumnFamilyStore.removeUnfinishedCompactionLeftovers when find that some to be removed files are already removed
[ https://issues.apache.org/jira/browse/CASSANDRA-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-6086: -- Fix Version/s: (was: 2.0.4) 2.0.5 > Node refuses to start with exception in > ColumnFamilyStore.removeUnfinishedCompactionLeftovers when find that some to > be removed files are already removed > - > > Key: CASSANDRA-6086 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6086 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Oleg Anastasyev >Assignee: Yuki Morishita > Fix For: 2.0.5 > > Attachments: 6086-2.0-v3.txt, 6086-v2.txt, > removeUnfinishedCompactionLeftovers.txt > > > Node refuses to start with > {code} > Caused by: java.lang.IllegalStateException: Unfinished compactions reference > missing sstables. This should never happen since compactions are marked > finished before we start removing the old sstables. > at > org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:544) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:262) > {code} > IMO, there is no reason to refuse to start upon discovering that files which > must be removed are already removed. It looks like pure bug-diagnostic code > and means nothing to the operator (nor can he do anything about it). > Replaced the throw of an exception with a diagnostic warning and continued > startup. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (CASSANDRA-6490) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849236#comment-13849236 ] Eric Evans commented on CASSANDRA-6490: --- bq. Done (Eric Evans, can you check the debian/dists/ directory and delete the 06x and 07x directories? I don't seem to have the right to do so and they don't point at anything existing anymore). Done. > Please delete old releases from mirroring system > > > Key: CASSANDRA-6490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6490 > Project: Cassandra > Issue Type: Bug > Environment: http://www.apache.org/dist/cassandra/ >Reporter: Sebb >Assignee: Sylvain Lebresne > > To reduce the load on the ASF mirrors, projects are required to delete old > releases [1] > Please can you remove all non-current releases? > Thanks! > [Note that older releases are always available from the ASF archive server] > Any links to older releases on download pages should first be adjusted to > point to the archive server. > [1] http://www.apache.org/dev/release.html#when-to-archive -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (CASSANDRA-6190) Cassandra 2.0 won't start up with Java 7u40 with Client JVM. (works on Server JVM, and both JVMs 7u25)
[ https://issues.apache.org/jira/browse/CASSANDRA-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805661#comment-13805661 ] Eric Evans commented on CASSANDRA-6190: --- +1, lgtm > Cassandra 2.0 won't start up with Java 7u40 with Client JVM. (works on > Server JVM, and both JVMs 7u25) > --- > > Key: CASSANDRA-6190 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6190 > Project: Cassandra > Issue Type: Bug > Components: Config > Environment: Ubuntu 13.04 32- and 64-bit JDK 7u40 (tried JRE 7u25) >Reporter: Steven Lowenthal >Assignee: Brandon Williams > Attachments: 6190.txt > > > Java 7u40 on some platforms does not recognize the -XX:+UseCondCardMark JVM > option. 7u40 on Macintosh works correctly, but if I use the tarball 7u40 > version, we encounter the error below. I tried 7u25 (the previous > release) and it functioned correctly. > ubuntu@ubuntu:~$ Unrecognized VM option 'UseCondCardMark' > Error: Could not create the Java Virtual Machine. > Error: A fatal exception has occurred. Program will exit. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6240) CLASSPATH logic from init script is unused, JNA isn't loaded
[ https://issues.apache.org/jira/browse/CASSANDRA-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805654#comment-13805654 ] Eric Evans commented on CASSANDRA-6240: --- This was fixed in the 2.0 branch as part of CASSANDRA-6101, and will be released as part of 2.0.2. There is one remaining issue with the init script that has gotten hung up in review. [~paravoid] if you could have a look at CASSANDRA-6131 and comment on it there, I'd be very grateful (I'll buy you a cheesesteak sandwich in Portland next summer :)). > CLASSPATH logic from init script is unused, JNA isn't loaded > > > Key: CASSANDRA-6240 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6240 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Faidon Liambotis >Assignee: Eric Evans > > The init script has a classpath() function that collects all the jars and > even includes this piece of code to work with the standard Debian/Ubuntu > libjna-jar: > {code:none} > # use JNA if installed in standard location > [ -r /usr/share/java/jna.jar ] && cp="$cp:/usr/share/java/jna.jar" > {code} > This seems very nice and correct, however the classpath() function is never > called and is entirely unused :) Instead, /usr/bin/cassandra is called, which > in turn includes /usr/share/cassandra/cassandra.in.sh, which has basically > similar code to collect the jars for CLASSPATH but a) without the JNA > standard path trick b) without using EXTRA_CLASSPATH (from > /etc/default/cassandra) at all, so Cassandra boots with neither JNA nor > EXTRA_CLASSPATH, contrary to expectations. > There are various suggestions on the web to do "ln -s /usr/share/java/jna.jar > /usr/share/cassandra/lib/"; I suspect this bug to be the reason for that. 
> /usr/share/cassandra/cassandra.in.sh seems smart enough to append but not > overwrite CLASSPATH, so fixing the init script's classpath() to only include > JNA + EXTRA_CLASSPATH (and making sure it's actually getting called :)) > should be enough for a fix. -- This message was sent by Atlassian JIRA (v6.1#6144)
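The fix suggested above boils down to two shell-level changes: have classpath() contribute only the JNA jar plus EXTRA_CLASSPATH, and make sure it actually gets called. A minimal sketch of such a function follows — it reuses the JNA path from the snippet quoted in the report, but the EXTRA_CLASSPATH value and the overall shape are illustrative, not the shipped init script:

```shell
# Hypothetical fixed classpath(): contribute only JNA + EXTRA_CLASSPATH,
# leaving jar collection to /usr/share/cassandra/cassandra.in.sh.
classpath() {
    cp="$EXTRA_CLASSPATH"
    # use JNA if installed in the standard Debian/Ubuntu location
    [ -r /usr/share/java/jna.jar ] && cp="$cp:/usr/share/java/jna.jar"
    echo "$cp"
}

# EXTRA_CLASSPATH would come from /etc/default/cassandra; demo value here
EXTRA_CLASSPATH="/opt/agents/agent.jar"
CLASSPATH="$(classpath)"
export CLASSPATH
```

Because cassandra.in.sh appends to an existing CLASSPATH rather than overwriting it, exporting this value before invoking /usr/bin/cassandra is enough for both JNA and EXTRA_CLASSPATH to survive.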
[jira] [Resolved] (CASSANDRA-6240) CLASSPATH logic from init script is unused, JNA isn't loaded
[ https://issues.apache.org/jira/browse/CASSANDRA-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans resolved CASSANDRA-6240. --- Resolution: Fixed > CLASSPATH logic from init script is unused, JNA isn't loaded > > > Key: CASSANDRA-6240 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6240 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Faidon Liambotis >Assignee: Eric Evans > > The init script has a classpath() function that collects all the jars and > even includes this piece of code to work with the standard Debian/Ubuntu > libjna-jar: > {code:none} > # use JNA if installed in standard location > [ -r /usr/share/java/jna.jar ] && cp="$cp:/usr/share/java/jna.jar" > {code} > This seems very nice and correct, however the classpath() function is never > called and is entirely unused :) Instead, /usr/bin/cassandra is called, which > in turn includes /usr/share/cassandra/cassandra.in.sh, which has basically > similar code to collect the jars for CLASSPATH but a) without the JNA > standard path trick b) without using EXTRA_CLASSPATH (from > /etc/default/cassandra) at all, so Cassandra boots without either JNA nor > EXTRA_CLASSPATH, contrary to expectations. > There are various suggestions on the web to do "ln -s /usr/share/java/jna.jar > /usr/share/cassandra/lib/"; I suspect this bug to be the reason for that. > /usr/share/cassandra/cassandra.in.sh seems smart enough to append but not > overwrite CLASSPATH, so fixing the init script's classpath() to only include > JNA + EXTRA_CLASSPATH (and making sure it's actually getting called :)) > should be enough for a fix. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages
[ https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794113#comment-13794113 ] Eric Evans commented on CASSANDRA-6131: --- [~sebastianlacuesta]: that patch is meant to apply to the 2.0 branch (where it will land); are you able to test against the 2.0 branch? > JAVA_HOME on cassandra-env.sh is ignored on Debian packages > --- > > Key: CASSANDRA-6131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6131 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: I've just got upgraded to the 2.0.1 package from the apache > repositories using apt. I had the JAVA_HOME environment variable set in > /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by > setting it on the /usr/sbin/cassandra script. I can't configure java 7 system > wide, only for cassandra. > Off-topic: Thanks for getting rid of the jsvc mess. >Reporter: Sebastián Lacuesta >Assignee: Eric Evans > Labels: debian > Fix For: 2.0.2 > > Attachments: 6131.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages
[ https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-6131: -- Attachment: (was: 6131.patch) > JAVA_HOME on cassandra-env.sh is ignored on Debian packages > --- > > Key: CASSANDRA-6131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6131 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: I've just got upgraded to 2.0.1 package from the apache > repositories using apt. I had the JAVA_HOME environment variable set in > /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by > setting it on /usr/sbin/cassandra script. I can't configure java 7 system > wide, only for cassandra. > Off-toppic: Thanks for getting rid of the jsvc mess. >Reporter: Sebastián Lacuesta >Assignee: Eric Evans > Labels: debian > Fix For: 2.0.2 > > Attachments: 6131.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages
[ https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-6131: -- Attachment: 6131.patch Rebased to {{cassandra-2.0}} branch. > JAVA_HOME on cassandra-env.sh is ignored on Debian packages > --- > > Key: CASSANDRA-6131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6131 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: I've just got upgraded to 2.0.1 package from the apache > repositories using apt. I had the JAVA_HOME environment variable set in > /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by > setting it on /usr/sbin/cassandra script. I can't configure java 7 system > wide, only for cassandra. > Off-toppic: Thanks for getting rid of the jsvc mess. >Reporter: Sebastián Lacuesta >Assignee: Eric Evans > Labels: debian > Fix For: 2.0.2 > > Attachments: 6131.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (CASSANDRA-6101) Debian init script broken
[ https://issues.apache.org/jira/browse/CASSANDRA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans resolved CASSANDRA-6101. --- Resolution: Fixed > Debian init script broken > - > > Key: CASSANDRA-6101 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6101 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Anton Winter >Assignee: Eric Evans >Priority: Minor > Attachments: 6101-classpath.patch, 6101.txt > > > The debian init script released in 2.0.1 contains 2 issues: > # The pidfile directory is not created if it doesn't already exist. > # Classpath not exported to the start-stop-daemon. > These lead to the init script not picking up jna.jar, or anything from the > debian EXTRA_CLASSPATH environment variable, and the init script not being > able to stop/restart Cassandra. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6101) Debian init script broken
[ https://issues.apache.org/jira/browse/CASSANDRA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786152#comment-13786152 ] Eric Evans commented on CASSANDRA-6101: --- bq. That service cassandra status problem is resolved by CASSANDRA-6090 I think so too; Closing this issue > Debian init script broken > - > > Key: CASSANDRA-6101 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6101 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Anton Winter >Assignee: Eric Evans >Priority: Minor > Attachments: 6101-classpath.patch, 6101.txt > > > The debian init script released in 2.0.1 contains 2 issues: > # The pidfile directory is not created if it doesn't already exist. > # Classpath not exported to the start-stop-daemon. > These lead to the init script not picking up jna.jar, or anything from the > debian EXTRA_CLASSPATH environment variable, and the init script not being > able to stop/restart Cassandra. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages
[ https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785471#comment-13785471 ] Eric Evans commented on CASSANDRA-6131: --- If setting {{JAVA_HOME}} from {{cassandra-env.sh}} ever worked before (from the Debian package), it was probably by accident, but there is no reason we can't support it going forward. For what it's worth, I'd probably recommend using {{/etc/default/cassandra}} for Debian/Ubuntu, but it will work with either. [~sebastianlacuesta], could you test the attached patch and let me know if this solves it for you? > JAVA_HOME on cassandra-env.sh is ignored on Debian packages > --- > > Key: CASSANDRA-6131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6131 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: I've just got upgraded to 2.0.1 package from the apache > repositories using apt. I had the JAVA_HOME environment variable set in > /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by > setting it on /usr/sbin/cassandra script. I can't configure java 7 system > wide, only for cassandra. > Off-toppic: Thanks for getting rid of the jsvc mess. >Reporter: Sebastián Lacuesta >Assignee: Eric Evans > Labels: debian > Fix For: 2.0.2 > > Attachments: 6131.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
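The recommendation above — set JAVA_HOME in /etc/default/cassandra — works when the wrapper sources that file before resolving which java binary to run. A hedged sketch of that sourcing order, using a temp file as a stand-in for /etc/default/cassandra and a purely illustrative JAVA_HOME value:

```shell
# Simulate /etc/default/cassandra with a temp file; the packaged wrapper
# would source the real file at this point instead.
DEFAULTS="${TMPDIR:-/tmp}/cassandra-defaults-demo.$$"
printf 'JAVA_HOME=/opt/java7\n' > "$DEFAULTS"   # illustrative path

# source the defaults (if readable) before choosing a java binary
[ -r "$DEFAULTS" ] && . "$DEFAULTS"

# prefer $JAVA_HOME/bin/java when set, else fall back to java on PATH
if [ -n "$JAVA_HOME" ]; then
    JAVA="$JAVA_HOME/bin/java"
else
    JAVA=java
fi

rm -f "$DEFAULTS"
```

The key point is ordering: any JAVA_HOME assignment must be sourced before the wrapper computes $JAVA, otherwise it is silently ignored — which is exactly the symptom reported here.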
[jira] [Updated] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages
[ https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-6131: -- Attachment: 6131.patch > JAVA_HOME on cassandra-env.sh is ignored on Debian packages > --- > > Key: CASSANDRA-6131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6131 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: I've just got upgraded to 2.0.1 package from the apache > repositories using apt. I had the JAVA_HOME environment variable set in > /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by > setting it on /usr/sbin/cassandra script. I can't configure java 7 system > wide, only for cassandra. > Off-toppic: Thanks for getting rid of the jsvc mess. >Reporter: Sebastián Lacuesta >Assignee: Eric Evans > Labels: debian > Fix For: 2.0.2 > > Attachments: 6131.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6131) JAVA_HOME on cassandra-env.sh is ignored on Debian packages
[ https://issues.apache.org/jira/browse/CASSANDRA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785278#comment-13785278 ] Eric Evans commented on CASSANDRA-6131: --- Could you expound on this a bit? Are you trying to _set_ {{JAVA_HOME}} from within {{cassandra-env.sh}}? > JAVA_HOME on cassandra-env.sh is ignored on Debian packages > --- > > Key: CASSANDRA-6131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6131 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: I've just got upgraded to 2.0.1 package from the apache > repositories using apt. I had the JAVA_HOME environment variable set in > /etc/cassandra/cassandra-env.sh but after the upgrade it only worked by > setting it on /usr/sbin/cassandra script. I can't configure java 7 system > wide, only for cassandra. > Off-toppic: Thanks for getting rid of the jsvc mess. >Reporter: Sebastián Lacuesta >Assignee: Eric Evans > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Comment Edited] (CASSANDRA-6116) /etc/init.d/cassandra stop and service don't work
[ https://issues.apache.org/jira/browse/CASSANDRA-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785255#comment-13785255 ] Eric Evans edited comment on CASSANDRA-6116 at 10/3/13 3:23 PM: This is probably a duplicate of CASSANDRA-6090 and/or CASSANDRA-6101; Could you try this again and let me know the result? Edit: Oh, and trying it again will require building a new package from the {{cassandra-2.0}} branch, let me know if you need help with this, or web access to a packaged snapshot. was (Author: urandom): This is probably a duplicate of CASSANDRA-6090 and/or CASSANDRA-6101; Could you try this again and let me know the result? > /etc/init.d/cassandra stop and service don't work > - > > Key: CASSANDRA-6116 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6116 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Cathy Daw >Assignee: Eric Evans >Priority: Minor > > These used to work in 2.0.0; the regression appears to have been introduced in 2.0.1 > Test Scenario > {noformat} > # Start Server > automaton@ip-10-171-39-230:~$ sudo service cassandra start > xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1862M -Xmx1862M > -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k > # Check Status > automaton@ip-10-171-39-230:~$ nodetool status > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- AddressLoad Tokens Owns Host ID > Rack > UN 127.0.0.1 81.72 KB 256 100.0% > e40ef77c-9cf7-4e27-b651-ede3b7269019 rack1 > # Check Status of service > automaton@ip-10-171-39-230:~$ sudo service cassandra status > xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1862M -Xmx1862M > -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k > * Cassandra is not running > # Stop Server > automaton@ip-10-171-39-230:~$ sudo service cassandra stop > xss = -ea 
-javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1862M -Xmx1862M > -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k > # Verify Server is no longer up > automaton@ip-10-171-39-230:~$ nodetool status > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- AddressLoad Tokens Owns Host ID > Rack > UN 127.0.0.1 81.72 KB 256 100.0% > e40ef77c-9cf7-4e27-b651-ede3b7269019 rack1 > {noformat} > Installation Instructions > {noformat} > wget http://people.apache.org/~slebresne/cassandra_2.0.1_all.deb > sudo dpkg -i cassandra_2.0.1_all.deb # Error about dependencies > sudo apt-get -f install > sudo dpkg -i cassandra_2.0.1_all.deb > {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6116) /etc/init.d/cassandra stop and service don't work
[ https://issues.apache.org/jira/browse/CASSANDRA-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785255#comment-13785255 ] Eric Evans commented on CASSANDRA-6116: --- This is probably a duplicate of CASSANDRA-6090 and/or CASSANDRA-6101; Could you try this again and let me know the result? > /etc/init.d/cassandra stop and service don't work > - > > Key: CASSANDRA-6116 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6116 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Cathy Daw >Assignee: Eric Evans >Priority: Minor > > These use to work in 2.0.0 and appears to be introduced in 2.0.1 > Test Scenario > {noformat} > # Start Server > automaton@ip-10-171-39-230:~$ sudo service cassandra start > xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1862M -Xmx1862M > -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k > # Check Status > automaton@ip-10-171-39-230:~$ nodetool status > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- AddressLoad Tokens Owns Host ID > Rack > UN 127.0.0.1 81.72 KB 256 100.0% > e40ef77c-9cf7-4e27-b651-ede3b7269019 rack1 > # Check Status of service > automaton@ip-10-171-39-230:~$ sudo service cassandra status > xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1862M -Xmx1862M > -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k > * Cassandra is not running > # Stop Server > automaton@ip-10-171-39-230:~$ sudo service cassandra stop > xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1862M -Xmx1862M > -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k > # Verify Server is no longer up > automaton@ip-10-171-39-230:~$ nodetool status > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- 
AddressLoad Tokens Owns Host ID > Rack > UN 127.0.0.1 81.72 KB 256 100.0% > e40ef77c-9cf7-4e27-b651-ede3b7269019 rack1 > {noformat} > Installation Instructions > {noformat} > wget http://people.apache.org/~slebresne/cassandra_2.0.1_all.deb > sudo dpkg -i cassandra_2.0.1_all.deb # Error about dependencies > sudo apt-get -f install > sudo dpkg -i cassandra_2.0.1_all.deb > {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6101) Debian init script broken
[ https://issues.apache.org/jira/browse/CASSANDRA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785250#comment-13785250 ] Eric Evans commented on CASSANDRA-6101: --- bq. Not really important, but status isn't working with this patch.. I'm not seeing that [~pieterc]; Which patch are you referring to, [6101.txt|https://issues.apache.org/jira/secure/attachment/12605201/6101.txt] or [6101-classpath.patch|https://issues.apache.org/jira/secure/attachment/12605920/6101-classpath.patch]? I wouldn't be surprised at this point to find there are more bugs with this, but only [6101.txt|https://issues.apache.org/jira/secure/attachment/12605201/6101.txt] should have had any impact on this, and it is definitely a change for the better. When you see this error, does a PID file exist at {{/var/run/cassandra/cassandra.pid}}? If so, what are the contents of the file? Is Cassandra running, and if so, what is its PID (hint: try {{pgrep -f CassandraDaemon}})? Could you attach the output of {{sh -x /etc/init.d/cassandra status}}? > Debian init script broken > - > > Key: CASSANDRA-6101 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6101 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Anton Winter >Assignee: Eric Evans >Priority: Minor > Attachments: 6101-classpath.patch, 6101.txt > > > The debian init script released in 2.0.1 contains 2 issues: > # The pidfile directory is not created if it doesn't already exist. > # Classpath not exported to the start-stop-daemon. > These lead to the init script not picking up jna.jar, or anything from the > debian EXTRA_CLASSPATH environment variable, and the init script not being > able to stop/restart Cassandra. -- This message was sent by Atlassian JIRA (v6.1#6144)
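The checks being asked for above (does a pidfile exist, and does its PID match a live process?) amount to the status logic sketched below — a minimal stand-in, not the packaged init script, using a temp pidfile rather than /var/run/cassandra/cassandra.pid:

```shell
# Minimal pidfile status check in the spirit of the init script's
# "status" action; the pidfile path is a demo stand-in.
PIDFILE="${TMPDIR:-/tmp}/cassandra-status-demo.$$"

is_running() {
    # running iff the pidfile is non-empty and its PID answers signal 0
    [ -s "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}

echo $$ > "$PIDFILE"        # our own PID: definitely alive
if is_running "$PIDFILE"; then live=yes; else live=no; fi

echo 4194305 > "$PIDFILE"   # almost certainly not a live PID: stale entry
if is_running "$PIDFILE"; then stale=yes; else stale=no; fi

rm -f "$PIDFILE"
```

A stale or missing pidfile makes this check report "not running" even while the JVM is up — which is why comparing the pidfile's contents against `pgrep -f CassandraDaemon` is the useful diagnostic here.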
[jira] [Commented] (CASSANDRA-6101) Debian init script broken
[ https://issues.apache.org/jira/browse/CASSANDRA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785246#comment-13785246 ] Eric Evans commented on CASSANDRA-6101: --- bq. Yes, that works as well. Thanks Anton; Committed > Debian init script broken > - > > Key: CASSANDRA-6101 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6101 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Anton Winter >Assignee: Eric Evans >Priority: Minor > Attachments: 6101-classpath.patch, 6101.txt > > > The debian init script released in 2.0.1 contains 2 issues: > # The pidfile directory is not created if it doesn't already exist. > # Classpath not exported to the start-stop-daemon. > These lead to the init script not picking up jna.jar, or anything from the > debian EXTRA_CLASSPATH environment variable, and the init script not being > able to stop/restart Cassandra. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (CASSANDRA-6090) init.d script not working under Ubuntu
[ https://issues.apache.org/jira/browse/CASSANDRA-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans resolved CASSANDRA-6090. --- Resolution: Fixed Partially committed, (the fix for directory creation was committed as part of CASSANDRA-6101); Thanks Laurent! > init.d script not working under Ubuntu > -- > > Key: CASSANDRA-6090 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6090 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: Ubuntu 12.04.2 LTS x64 >Reporter: Laurent Raufaste >Assignee: Eric Evans >Priority: Minor > Fix For: 2.0.1 > > > When installing the Cassandra package on Ubuntu, it starts up automatically > without writing the PID file. > It renders the init.d script useless as it can't status or stop cassandra. > I submitted a PR on github to fix this: > https://github.com/apache/cassandra/pull/21 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6101) Debian init script broken
[ https://issues.apache.org/jira/browse/CASSANDRA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781978#comment-13781978 ] Eric Evans commented on CASSANDRA-6101: --- Good catch Anton. I've applied the fix to ensure that the PID directory is created, but I think we can go a bit further with the classpath. That function is a throwback to {{jsvc}} and duplicates the classpath construction we do elsewhere. I've attached [6101-classpath.patch|https://issues.apache.org/jira/secure/attachment/12605920/6101-classpath.patch] which eliminates that function and uses {{EXTRA_CLASSPATH}} instead. Can you give this a whirl and see if it (still) fixes the issues you were seeing? > Debian init script broken > - > > Key: CASSANDRA-6101 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6101 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Anton Winter >Assignee: Eric Evans >Priority: Minor > Attachments: 6101-classpath.patch, 6101.txt > > > The debian init script released in 2.0.1 contains 2 issues: > # The pidfile directory is not created if it doesn't already exist. > # Classpath not exported to the start-stop-daemon. > These lead to the init script not picking up jna.jar, or anything from the > debian EXTRA_CLASSPATH environment variable, and the init script not being > able to stop/restart Cassandra. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6101) Debian init script broken
[ https://issues.apache.org/jira/browse/CASSANDRA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-6101: -- Attachment: 6101-classpath.patch > Debian init script broken > - > > Key: CASSANDRA-6101 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6101 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Anton Winter >Assignee: Eric Evans >Priority: Minor > Attachments: 6101-classpath.patch, 6101.txt > > > The debian init script released in 2.0.1 contains 2 issues: > # The pidfile directory is not created if it doesn't already exist. > # Classpath not exported to the start-stop-daemon. > These lead to the init script not picking up jna.jar, or anything from the > debian EXTRA_CLASSPATH environment variable, and the init script not being > able to stop/restart Cassandra. -- This message was sent by Atlassian JIRA (v6.1#6144)