[jira] [Commented] (CASSANDRA-11724) False Failure Detection in Big Cassandra Cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535240#comment-15535240 ]

Jackson Chung commented on CASSANDRA-11724:
-------------------------------------------

I believe we ran into this as well (along with CASSANDRA-10371). No, I don't have a test case, sorry.

To "fix" CASSANDRA-10371, a rolling restart appeared to work (will monitor for a couple more days). But for this issue, the FailureDetector JMX attribute shows an IP as DOWN even though it was properly decommissioned (no hang, didn't need to do removenode).

getEndpointState (or gossipinfo) shows:
{noformat}
/10.30.10.146
  generation:1459192362
  heartbeat:48032911
  RACK:10:r1
  NET_VERSION:1:7
  LOAD:48032807:8.68526498837E11
  SEVERITY:48032913:0.0
  HOST_ID:2:e96fdd2b-73a0-4579-bc04-3b60a557c2d3
  STATUS:24603149:LEFT,13028853640594434189771438209987024084,1475453013098
  DC:8:DC_OREGON_OFFLINE
  SCHEMA:46105309:db7592b0-5047-3595-bfea-e3efce1aa75f
  RELEASE_VERSION:4:2.0.17
  INTERNAL_IP:6:10.30.10.146
  RPC_ADDRESS:3:10.30.10.146
  TOKENS:15:
{noformat}

> False Failure Detection in Big Cassandra Cluster
> ------------------------------------------------
>
>                 Key: CASSANDRA-11724
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11724
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jeffrey F. Lukman
>              Labels: gossip, node-failure
>         Attachments: Workload1.jpg, Workload2.jpg, Workload3.jpg, Workload4.jpg, experiment-result.txt
>
> We are running some tests on Cassandra v2.2.5 stable in a big cluster. In our
> setup, each machine has 16 cores and runs 8 Cassandra instances, and we test
> with 32, 64, 128, 256, and 512 instances of Cassandra. We use the default
> number of vnodes for each instance, which is 256. The data and log
> directories are on an in-memory tmpfs file system.
> We run several types of workloads on this Cassandra cluster:
> Workload1: Just start the cluster
> Workload2: Start half of the cluster, wait until it gets into a stable
> condition, and run the other half of the cluster
> Workload3: Start half of the cluster, wait until it gets into a stable
> condition, load some data, and run the other half of the cluster
> Workload4: Start the cluster, wait until it gets into a stable condition,
> load some data, and decommission one node
> For this testing, we measure the total number of false failure detections
> inside the cluster. By false failure detection we mean that, for example,
> instance-1 marks instance-2 as down, but instance-2 is not down. We dug
> deeper into the root cause and found that instance-1 had not received any
> heartbeat from instance-2 for some time because instance-2 was running a
> long computation.
> Here I attach the graphs of each workload result.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8874) running out of FD, and causing clients hang when dropping a keyspace with many CF with many sstables
Jackson Chung created CASSANDRA-8874:
-------------------------------------

             Summary: running out of FD, and causing clients hang when dropping a keyspace with many CF with many sstables
                 Key: CASSANDRA-8874
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8874
             Project: Cassandra
          Issue Type: Bug
            Reporter: Jackson Chung

We already set the number of file descriptors to 10 for C* usage, and confirmed that from /proc/$cass_pid/limits.

We have 16 nodes in 2 DCs, and each node stores about 600GB to 1TB of data; EC2, i2-2xl instances, with the 2 disks in raid0.

We use both the Hector and DataStax drivers, and there are many clients connecting to the cluster.

One day we dropped a keyspace (that our app no longer uses) which has a good number of CFs, some of them using leveled compaction and having a good number of sstables... and our app went down. CPU/load average were high and we couldn't even ssh to the nodes. We had to force a reboot and restart 2 of the C* nodes, whose logs were filled with (hundreds of thousands of) "too many open files" errors.

C* 2.0.11

{noformat}
$ grep -ic "caused by.*too many open file" system.log.*
system.log.1:0
system.log.10:18659
system.log.11:17539
system.log.12:18941
system.log.13:18936
system.log.14:18601
system.log.15:18933
system.log.16:18937
system.log.17:18954
system.log.18:18892
system.log.19:18942
system.log.2:0
system.log.20:18977
system.log.21:18977
system.log.22:18852
system.log.23:18978
system.log.24:18978
system.log.25:18978
system.log.26:18978
system.log.27:18978
system.log.28:18978
system.log.29:18978
system.log.3:654
system.log.30:18978
system.log.31:18978
system.log.32:18978
system.log.33:18977
system.log.34:18978
system.log.35:18978
system.log.36:17943
system.log.37:18867
system.log.38:15082
system.log.39:17766
system.log.4:17932
system.log.40:18029
system.log.41:18890
system.log.42:18048
system.log.43:18812
system.log.44:18787
system.log.45:18962
system.log.46:18978
system.log.47:18978
system.log.48:18978
system.log.49:18978
system.log.5:15284
system.log.50:18978
system.log.6:17180
system.log.7:17286
system.log.8:18651
system.log.9:17720
{noformat}

All the logs are from that day.
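For readers who want to watch descriptor usage before it reaches the limit described in the report above, here is a minimal sketch. It assumes a HotSpot/OpenJDK JVM on a Unix-like system, where the platform `OperatingSystemMXBean` also implements `com.sun.management.UnixOperatingSystemMXBean`. The class name `FdUsage` is made up for illustration, and the code inspects the JVM it runs in, not a remote Cassandra process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper (not part of Cassandra): reports file descriptor usage
// of the current JVM, where the platform exposes it.
final class FdUsage {

    /** Returns {open, max} descriptor counts, or {-1, -1} if the JVM does not expose them. */
    static long[] fileDescriptorUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] { unix.getOpenFileDescriptorCount(), unix.getMaxFileDescriptorCount() };
        }
        // Assumption: on non-Unix platforms we simply report "unknown".
        return new long[] { -1L, -1L };
    }

    public static void main(String[] args) {
        long[] usage = fileDescriptorUsage();
        System.out.println("open=" + usage[0] + " max=" + usage[1]);
    }
}
```

For a running node, the same two values are available over JMX as the `OpenFileDescriptorCount` and `MaxFileDescriptorCount` attributes of the `java.lang:type=OperatingSystem` MBean.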
[jira] [Commented] (CASSANDRA-7467) flood of setting live ratio to maximum of 64 from repair
[ https://issues.apache.org/jira/browse/CASSANDRA-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047150#comment-14047150 ]

Jackson Chung commented on CASSANDRA-7467:
------------------------------------------

Please ignore the previous comment. 2 nodes misbehaved overnight while repair was running (on another node), and the crontab flush was already disabled.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-7467) flood of setting live ratio to maximum of 64 from repair
Jackson Chung created CASSANDRA-7467:
-------------------------------------

             Summary: flood of setting live ratio to maximum of 64 from repair
                 Key: CASSANDRA-7467
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7467
             Project: Cassandra
          Issue Type: Bug
            Reporter: Jackson Chung

We are on 2.0.8, running repair -pr -local KS. All nodes are on i2.2x (60G RAM) with 8G of heap, using Java 8 (key cache size is 1G).

On occasion, when repair is run, the C* node that runs the repair, or another node in the cluster, or both, get into a bad state in which system.log just prints "setting live ratio to maximum of 64" forever, every split second. It usually happens when repairing one of the larger/wider CFs.

 WARN [MemoryMeter:1] 2014-06-28 09:13:24,540 Memtable.java (line 470) setting live ratio to maximum of 64.0 instead of Infinity
 INFO [MemoryMeter:1] 2014-06-28 09:13:24,540 Memtable.java (line 481) CFS(Keyspace='RIQ', ColumnFamily='MemberTimeline') liveRatio is 64.0 (just-counted was 64.0). calculation took 0ms for 0 cells

Table: MemberTimeline
SSTable count: 13
Space used (live), bytes: 17644018786
...
Compacted partition minimum bytes: 30
Compacted partition maximum bytes: 464228842
Compacted partition mean bytes: 54578

Just to give an idea of how bad this is: the log file is set to rotate 50 times at 21M each. In less than 15 minutes, all the logs are filled up with just that message. C* is not responding and can't be killed normally. The only way is kill -9.
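The WARN line in the report above ("64.0 instead of Infinity", logged together with "0 cells") is what a size ratio looks like when its denominator is zero. The sketch below only illustrates that arithmetic and the clamping. It is not the Memtable.java code: the class and method names are hypothetical, the upper bound 64.0 is taken from the log message, and the lower bound 1.0 is an assumption.

```java
// Hypothetical illustration of clamping a "live ratio" (in-memory size divided
// by serialized size) into a sane range. Not the actual Cassandra code.
final class LiveRatioSketch {
    static final double MIN_SANE_LIVE_RATIO = 1.0;   // assumption
    static final double MAX_SANE_LIVE_RATIO = 64.0;  // value seen in the log line

    static double clampLiveRatio(long deepSizeBytes, long serializedBytes) {
        // In floating point, x / 0 is Infinity for x > 0 and NaN for x == 0.
        double ratio = (double) deepSizeBytes / serializedBytes;
        if (Double.isNaN(ratio) || ratio < MIN_SANE_LIVE_RATIO) {
            return MIN_SANE_LIVE_RATIO;
        }
        if (ratio > MAX_SANE_LIVE_RATIO) {
            return MAX_SANE_LIVE_RATIO;
        }
        return ratio;
    }

    public static void main(String[] args) {
        System.out.println(clampLiveRatio(1024, 0));   // Infinity is clamped to the maximum
        System.out.println(clampLiveRatio(100, 50));   // an ordinary ratio passes through
    }
}
```

Under these assumptions, a memtable that is measured while it holds no serialized bytes would always land on the maximum, which is consistent with the repeated log message but does not by itself explain why the measurement keeps being rescheduled.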
[jira] [Commented] (CASSANDRA-7467) flood of setting live ratio to maximum of 64 from repair
[ https://issues.apache.org/jira/browse/CASSANDRA-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046975#comment-14046975 ]

Jackson Chung commented on CASSANDRA-7467:
------------------------------------------

We have a cron job that periodically runs flush on the entire keyspace, so it could be that.
[jira] [Commented] (CASSANDRA-7467) flood of setting live ratio to maximum of 64 from repair
[ https://issues.apache.org/jira/browse/CASSANDRA-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047034#comment-14047034 ]

Jackson Chung commented on CASSANDRA-7467:
------------------------------------------

Disabled the flush-keyspace cron job, and it does seem to help. However, during the repair those lines are still logged frequently (due to streaming from the repair?). But at least it is no longer in an infinite-loop style.
[jira] [Commented] (CASSANDRA-7363) PropertyFileSnitch should allow name address that does not yet exist
[ https://issues.apache.org/jira/browse/CASSANDRA-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025878#comment-14025878 ]

Jackson Chung commented on CASSANDRA-7363:
------------------------------------------

bq. This isn't possible, it needs to know the IP. Being able to specify a hostname is a convenience.

Maybe this makes more sense if there is no known host at all? IOW, only throw the ConfigurationException if the reloadedMap is empty in the end?

In AWS, since we are not reserving IPs, it is not possible to know the IPs ahead of time. We could work around this by starting the instances with those hostnames first without starting C*, but preferably not.

Another flaw in the "needs to know the IP" logic is that the current check relies on
{code}
host = InetAddress.getByName(hostString);
{code}
This implies "If a literal IP address is supplied, only the validity of the address format is checked.", i.e. not necessarily the existence of that IP. So it doesn't seem to make sense that a non-existing IP will succeed but a non-existing hostname will fail.
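The asymmetry described in the comment on CASSANDRA-7363 above can be reproduced with a few lines. This is a minimal sketch: the class name `ResolveCheck`, the address `10.255.255.1`, and the hostname `no-such-node.invalid` are illustrative choices, not values from the issue. The hostname result depends on the local resolver, so no output is claimed for it.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo (not Cassandra code) of how InetAddress.getByName treats
// literal IP addresses versus host names.
final class ResolveCheck {

    /** Returns true when InetAddress.getByName accepts the given string. */
    static boolean resolves(String hostString) {
        try {
            InetAddress.getByName(hostString);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A literal IPv4 address is only checked for format; no host needs to exist there.
        System.out.println("literal IP accepted: " + resolves("10.255.255.1"));
        // A host name goes through the name service and fails if it cannot be resolved.
        System.out.println("unknown name accepted: " + resolves("no-such-node.invalid"));
    }
}
```

A snitch that wanted to tolerate not-yet-resolvable names would have to decide what to do in the `catch` branch (skip the entry, retry later, or fail only when nothing resolves). That is the design question raised in the comment, and this sketch does not answer it.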
[jira] [Created] (CASSANDRA-7363) PropertyFileSnitch should allow name address that does not yet exist
Jackson Chung created CASSANDRA-7363:
-------------------------------------

             Summary: PropertyFileSnitch should allow name address that does not yet exist
                 Key: CASSANDRA-7363
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7363
             Project: Cassandra
          Issue Type: Bug
            Reporter: Jackson Chung

When starting a new node with PropertyFileSnitch and a cassandra-topology.properties that contains an unknown host, it fails with:

{noformat}
ERROR [main] 2014-06-06 17:48:38,233 DatabaseDescriptor.java (line 116) Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Error instantiating snitch class 'org.apache.cassandra.locator.PropertyFileSnitch'.
    at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:503)
    at org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:506)
    at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:341)
    at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:111)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:155)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:569)
Caused by: org.apache.cassandra.exceptions.ConfigurationException: Unknown host cassandra11-staging.amz.relateiq.com
    at org.apache.cassandra.locator.PropertyFileSnitch.reloadConfiguration(PropertyFileSnitch.java:174)
    at org.apache.cassandra.locator.PropertyFileSnitch.<init>(PropertyFileSnitch.java:60)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at java.lang.Class.newInstance(Class.java:433)
    at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:488)
    ... 6 more
Caused by: java.net.UnknownHostException: cassandra11-staging.amz.relateiq.com: unknown error
    at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:907)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1302)
    at java.net.InetAddress.getAllByName0(InetAddress.java:1255)
    at java.net.InetAddress.getAllByName(InetAddress.java:1171)
    at java.net.InetAddress.getAllByName(InetAddress.java:1105)
    at java.net.InetAddress.getByName(InetAddress.java:1055)
    at org.apache.cassandra.locator.PropertyFileSnitch.reloadConfiguration(PropertyFileSnitch.java:170)
    ... 13 more
{noformat}

The real impact here is that we are trying to launch a number of new nodes (via Chef) with pre-configured hostnames (among other variables). The additional hostname (not yet alive) had no impact on the existing nodes, which is good (it looks like we only catch the ConfigurationException in the watcher thread, but not on the initial start); but it causes a new node to fail to start. (Pretty sure that if we restart an existing one, it will fail too.)
[jira] [Commented] (CASSANDRA-7090) Add ability to set/get logging levels to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983296#comment-13983296 ]

Jackson Chung commented on CASSANDRA-7090:
------------------------------------------

Hey [~vijay2...@gmail.com], the diff file suggests it is modifying code based on logback, which is not applicable to 2.0. Did you mean to say rebase to 2.1 instead?

Add ability to set/get logging levels to nodetool
-------------------------------------------------

                Key: CASSANDRA-7090
                URL: https://issues.apache.org/jira/browse/CASSANDRA-7090
            Project: Cassandra
         Issue Type: Improvement
         Components: Tools
           Reporter: Jackson Chung
           Assignee: Jackson Chung
           Priority: Minor
            Fix For: 2.0.8, 2.1 beta2
        Attachments: 0001-CASSANDRA-7090.patch, logging.diff, patch-7090.v20
[jira] [Commented] (CASSANDRA-7090) Add ability to set/get logging levels to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983316#comment-13983316 ]

Jackson Chung commented on CASSANDRA-7090:
------------------------------------------

Right, that is why I have made 2 patches:

https://issues.apache.org/jira/secure/attachment/12642001/patch-7090.v20 -- based on 2.0
https://issues.apache.org/jira/secure/attachment/12641851/logging.diff -- based on master
[jira] [Updated] (CASSANDRA-7090) Add ability to set/get logging levels to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackson Chung updated CASSANDRA-7090:
-------------------------------------
    Attachment: patch-7090.v20

Attaching a v2.0 version (since we won't be using 2.1 soon, our next upgrade is 2.0).

Sample nodetool output (obviously the real value is that the logging actually works, and it does):

{noformat}
jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool getLoggingLevels
Logger Name                                    Log Level
root                                           INFO
org.apache.thrift.server.TNonblockingServer    ERROR

jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool setlogginglevel org.apache.cassandra.gms TRACE
jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool setlogginglevel org.apache.cassandra.service DEBUG
jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool getLoggingLevels
Logger Name                                    Log Level
root                                           INFO
org.apache.cassandra.service                   DEBUG
org.apache.thrift.server.TNonblockingServer    ERROR
org.apache.cassandra.gms                       TRACE

jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool setlogginglevel org.apache.cassandra.service
jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool getLoggingLevels
Logger Name                                    Log Level
root                                           INFO
org.apache.thrift.server.TNonblockingServer    ERROR
org.apache.cassandra.gms                       TRACE

jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool setlogginglevel org.apache.cassandra.db BLAH
jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool getLoggingLevels
Logger Name                                    Log Level
root                                           INFO
org.apache.thrift.server.TNonblockingServer    ERROR
org.apache.cassandra.db                        DEBUG
org.apache.cassandra.gms                       TRACE

jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool setlogginglevel
jackson@faranth:~/work/cassandra-2.0$ ./bin/nodetool getLoggingLevels
Logger Name                                    Log Level
root                                           INFO
org.apache.thrift.server.TNonblockingServer    ERROR
{noformat}

Add ability to set/get logging levels to nodetool
-------------------------------------------------

                Key: CASSANDRA-7090
                URL: https://issues.apache.org/jira/browse/CASSANDRA-7090
            Project: Cassandra
         Issue Type: Improvement
           Reporter: Jackson Chung
           Priority: Minor
        Attachments: logging.diff, patch-7090.v20
[jira] [Created] (CASSANDRA-7090) Add ability to set/get logging levels to nodetool
Jackson Chung created CASSANDRA-7090:
-------------------------------------

             Summary: Add ability to set/get logging levels to nodetool
                 Key: CASSANDRA-7090
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7090
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Jackson Chung
            Priority: Minor
         Attachments: logging.diff

While it is nice to use logback (per #CASSANDRA-5883) with its autoreload feature, in some cases ops/admins may not have the permission or ability to modify the configuration file(s). Or the files are controlled by puppet/chef, so it is not desirable to modify them manually.

There is already an existing operation for setLoggingLevel in the StorageServiceMBean, so it is easy to expose that to nodetool.

What was lacking was the ability to see the current log level settings for the various loggers.

The attached diff aims to do 3 things:
# add JMX getLoggingLevels -- return a map of current loggers and the corresponding levels
# expose both getLoggingLevels and setLoggingLevel to nodetool. In particular, setLoggingLevel behaves as follows:
#* If both classQualifier and level are empty/null, it will reload the configuration to reset.
#* If classQualifier is not empty but level is empty/null, it will set the level to null for the defined classQualifier
#* The logback configuration should have <jmxConfigurator /> set

The diff is based on the master branch, which uses logback, so it is not applicable to 2.0 or 1.2 (2.1 is ok). Though it would be nice to have the same ability for 2.0.
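The setLoggingLevel rules listed in the description above can be modelled with a small in-memory map. This is a toy sketch of the semantics only, not the attached patch and not the logback or log4j API. The class name is hypothetical, "set the level to null" is modelled as removing the override, the fallback of an unrecognised level name to DEBUG mirrors the `BLAH` case in the sample 2.0 output posted on this issue, and ignoring a level given without a class is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model (hypothetical, not Cassandra code) of the get/set logging level semantics.
final class LoggingLevelsSketch {
    private static final List<String> KNOWN_LEVELS =
            Arrays.asList("TRACE", "DEBUG", "INFO", "WARN", "ERROR");

    private final Map<String, String> configured;            // levels from the config file
    private final Map<String, String> current = new TreeMap<>();

    LoggingLevelsSketch(Map<String, String> configured) {
        this.configured = new TreeMap<>(configured);
        current.putAll(configured);
    }

    void setLoggingLevel(String classQualifier, String level) {
        boolean noClass = classQualifier == null || classQualifier.trim().isEmpty();
        boolean noLevel = level == null || level.trim().isEmpty();
        if (noClass && noLevel) {          // both empty: reload the configuration to reset
            current.clear();
            current.putAll(configured);
            return;
        }
        if (noClass) {
            return;                        // assumption: a level without a class is ignored
        }
        if (noLevel) {                     // class only: clear the level for that logger
            current.remove(classQualifier);
            return;
        }
        String upper = level.trim().toUpperCase();
        // Assumption: unknown level names fall back to DEBUG, as in the sample output.
        current.put(classQualifier, KNOWN_LEVELS.contains(upper) ? upper : "DEBUG");
    }

    Map<String, String> getLoggingLevels() {
        return Collections.unmodifiableMap(new TreeMap<>(current));
    }

    public static void main(String[] args) {
        Map<String, String> config = new TreeMap<>();
        config.put("root", "INFO");
        LoggingLevelsSketch sketch = new LoggingLevelsSketch(config);
        sketch.setLoggingLevel("org.apache.cassandra.gms", "TRACE");
        System.out.println(sketch.getLoggingLevels());
        sketch.setLoggingLevel(null, null);
        System.out.println(sketch.getLoggingLevels());
    }
}
```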
[jira] [Commented] (CASSANDRA-6415) Snapshot repair blocks for ever if something happens to the I made my snapshot response
[ https://issues.apache.org/jira/browse/CASSANDRA-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974462#comment-13974462 ]

Jackson Chung commented on CASSANDRA-6415:
------------------------------------------

I ran into the stuck issue on 1.2.10.

After upgrading to 1.2.16, I could see that repair is not stuck, in the sense that I see multiple repair sessions/stages start and finish. But in the end (after waiting a long time), there is no more activity in the log or in compactionstats/netstats, and yet tpstats still shows Active and Pending counts in these stages:

AntiEntropyStage                  1         2           5073         0                 0
AntiEntropySessions               1         1             44         0                 0

Snapshot repair blocks for ever if something happens to the I made my snapshot response
-----------------------------------------------------------------------------------------

                Key: CASSANDRA-6415
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6415
            Project: Cassandra
         Issue Type: Bug
           Reporter: Jeremiah Jordan
           Assignee: Yuki Morishita
             Labels: repair
            Fix For: 1.2.13, 2.0.4
        Attachments: 6415-1.2.txt

The snapshotLatch.await(); can wait forever and block all repair operations indefinitely if something happens such that another node doesn't respond.

{noformat}
public void makeSnapshots(Collection<InetAddress> endpoints)
{
    try
    {
        snapshotLatch = new CountDownLatch(endpoints.size());
        IAsyncCallback callback = new IAsyncCallback()
        {
            public boolean isLatencyForSnitch()
            {
                return false;
            }

            public void response(MessageIn msg)
            {
                RepairJob.this.snapshotLatch.countDown();
            }
        };
        for (InetAddress endpoint : endpoints)
            MessagingService.instance().sendRR(new SnapshotCommand(tablename, cfname, sessionName, false).createMessage(), endpoint, callback);
        snapshotLatch.await();
        snapshotLatch = null;
    }
    catch (InterruptedException e)
    {
        throw new RuntimeException(e);
    }
}
{noformat}
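One generic way to keep a latch wait from blocking forever is the timed `CountDownLatch.await(long, TimeUnit)` overload. The sketch below shows that technique in isolation. It is not the change in the attached 6415-1.2.txt (its contents are not shown in this thread), and the class name, method name, and thread-based "endpoints" are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a bounded wait on a CountDownLatch.
final class BoundedLatchWait {

    /**
     * Simulates sending a request to {@code endpoints} nodes of which only
     * {@code responders} ever answer, and waits at most {@code timeoutMillis}.
     * Returns true only if every endpoint answered in time.
     */
    static boolean awaitAllResponses(int endpoints, int responders, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(endpoints);
        for (int i = 0; i < responders; i++) {
            new Thread(latch::countDown).start();   // each responder counts down once
        }
        try {
            // Unlike await(), this returns false when the timeout elapses first.
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("all responded: " + awaitAllResponses(3, 3, 2000));
        System.out.println("one silent node: " + awaitAllResponses(3, 2, 200));
    }
}
```

With a bounded wait the caller still has to decide what a timeout means for the repair job (fail the session, retry, or skip the silent node). The sketch leaves that open.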
[jira] [Commented] (CASSANDRA-6016) Ability to change replication factor for the trace keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826900#comment-13826900 ]

Jackson Chung commented on CASSANDRA-6016:
------------------------------------------

Will there be a further 1.2(.12) release? If so, can this be put in 1.2 please? Thx.

Ability to change replication factor for the trace keyspace
-----------------------------------------------------------

                Key: CASSANDRA-6016
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6016
            Project: Cassandra
         Issue Type: New Feature
         Components: Core
           Reporter: Jeremiah Jordan
           Assignee: Yuki Morishita
           Priority: Minor
            Fix For: 2.0.2
        Attachments: 6016-2.0-v2.txt, 6016-2.0-v3.txt, 6016-2.0.txt

The trace keyspace is currently RF=1, and can't be changed. I want to be able to trace stuff when nodes are down/being stupid.

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (CASSANDRA-6289) Murmur3Partitioner doesn't yield proper ownership calculation
Jackson Chung created CASSANDRA-6289:
-------------------------------------

Summary: Murmur3Partitioner doesn't yield proper ownership calculation
Key: CASSANDRA-6289
URL: https://issues.apache.org/jira/browse/CASSANDRA-6289
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

In a new 1.2 install with Murmur3 as the default, I set up a test cluster with N=RF=3 (cluster size and the keyspace's RF), but when I look at the ring output (with the keyspace name), to my surprise it shows RF=2. Further investigation shows the total replica count is the sum of the float values from effectiveOwnership. But that results in 1 for the setup:

{panel}
#bean is set to org.apache.cassandra.db:type=StorageService
$run effectiveOwnership Keyspace1
#calling operation effectiveOwnership of mbean org.apache.cassandra.db:type=StorageService
#operation returns:
\{ /127.0.0.1 = 0.989; /127.0.0.2 = 0.989; /127.0.0.3 = 0.989; \}
{panel}

{panel}
$ ./bin/nodetool -h 0 -p 7100 ring Keyspace1

Datacenter: datacenter1
==========
Replicas: 2

Address    Rack   Status  State   Load       Owns     Token
                                                      3074457345618258602
127.0.0.1  rack1  Up      Normal  1.02 GB    100.00%  -9223372036854775808
127.0.0.2  rack1  Up      Normal  996.38 MB  100.00%  -3074457345618258603
127.0.0.3  rack1  Up      Normal  980.55 MB  100.00%  3074457345618258602
{panel}

{panel}
Keyspace: Keyspace1:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
  Options: [replication_factor:3]
{panel}

The println simply casts the float value to int, so I guess it rounds down. When using RandomPartitioner, effectiveOwnership returns 1.0.

So I guess the real question is: is the Murmur3 calculation correct, or is it losing precision? If it is correct, then I guess we need to force the float -> int conversion to round up? (Is that even the right thing to do?)

-- This message was sent by Atlassian JIRA (v6.1#6144)
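To make the arithmetic in this report concrete: three ownership values of 0.989 sum to roughly 2.967, which a plain cast to int truncates to 2, while rounding gives 3. A small sketch, assuming (as the report suggests) that the replica count is derived by summing the effectiveOwnership floats; the class and method names are hypothetical and this is not nodetool's actual code:

```java
class ReplicaCount {
    // Sum the per-node ownership values and truncate, as a plain (int) cast does.
    static int truncated(float[] ownership) {
        float total = 0f;
        for (float f : ownership) total += f;
        return (int) total;
    }

    // Same sum, but rounded to the nearest integer instead of truncated.
    static int rounded(float[] ownership) {
        float total = 0f;
        for (float f : ownership) total += f;
        return Math.round(total);
    }

    public static void main(String[] args) {
        float[] murmur3 = {0.989f, 0.989f, 0.989f}; // values reported above
        System.out.println("cast:  " + truncated(murmur3)); // prints "cast:  2"
        System.out.println("round: " + rounded(murmur3));   // prints "round: 3"
    }
}
```

Whether rounding is the right fix, or whether the 0.989 itself points to a precision problem in the ownership calculation, is exactly the open question raised in the ticket.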
[jira] [Created] (CASSANDRA-6258) Need the root clause in FBUtilities.classForName when there is exception loading class
Jackson Chung created CASSANDRA-6258:
-------------------------------------

Summary: Need the root clause in FBUtilities.classForName when there is exception loading class
Key: CASSANDRA-6258
URL: https://issues.apache.org/jira/browse/CASSANDRA-6258
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

We have a custom snitch that works in 1.1, but the same snitch does not work in 1.2. It throws a ConfigurationException:

{panel}
ERROR 11:39:37,936 Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Unable to find snitch class 'com.apigee.cassandra.OldEC2Snitch'
    at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:432)
    at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:444)
    at org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:530)
    at org.apache.cassandra.config.DatabaseDescriptor.loadYaml(DatabaseDescriptor.java:350)
    at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:126)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:216)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
Unable to find snitch class 'com.apigee.cassandra.OldEC2Snitch'
{panel}

However, the above exception does not help us understand what's wrong (the jar is in the classpath and readable).

I had to add the root cause to the ConfigurationException to see the real problem:

{panel}
ERROR 11:42:20,020 Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Unable to find snitch class 'com.apigee.cassandra.OldEC2Snitch'
    at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:432)
    at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:444)
    at org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:530)
    at org.apache.cassandra.config.DatabaseDescriptor.loadYaml(DatabaseDescriptor.java:350)
    at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:126)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:216)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
Caused by: java.lang.NoClassDefFoundError: org/apache/cassandra/config/ConfigurationException
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:169)
    at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:424)
    ... 7 more
Caused by: java.lang.ClassNotFoundException: org.apache.cassandra.config.ConfigurationException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 10 more
Unable to find snitch class 'com.apigee.cassandra.OldEC2Snitch'
{panel}

-- This message was sent by Atlassian JIRA (v6.1#6144)
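The change requested in this ticket amounts to ordinary Java exception chaining: pass the original Throwable as the cause when wrapping it. A minimal sketch, assuming a stand-in ConfigurationException class and a simplified classForName; both are hypothetical illustrations, not the code in FBUtilities or the attached patch:

```java
class RootCauseDemo {
    // Hypothetical stand-in for Cassandra's ConfigurationException.
    static class ConfigurationException extends Exception {
        ConfigurationException(String message, Throwable cause) {
            super(message, cause); // keeps the original failure as the cause
        }
    }

    // Simplified loader: wraps the failure but preserves the root cause.
    static Class<?> classForName(String className, String readable) throws ConfigurationException {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            throw new ConfigurationException(
                String.format("Unable to find %s class '%s'", readable, className), e);
        }
    }

    public static void main(String[] args) {
        try {
            classForName("com.example.MissingSnitch", "snitch"); // hypothetical class name
        } catch (ConfigurationException e) {
            System.out.println(e.getMessage());
            System.out.println("Caused by: " + e.getCause());
        }
    }
}
```

With the cause attached, a logger or printStackTrace() emits the "Caused by:" sections shown in the second trace, which is what reveals that the snitch still referenced the old org.apache.cassandra.config.ConfigurationException class.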
[jira] [Updated] (CASSANDRA-6258) Need the root clause in FBUtilities.classForName when there is exception loading class
[ https://issues.apache.org/jira/browse/CASSANDRA-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jackson Chung updated CASSANDRA-6258: - Attachment: cassandra-6258.patch attaching diff (based off 1.2) to add the root clause when throwing ConfigurationException Need the root clause in FBUtilities.classForName when there is exception loading class -- Key: CASSANDRA-6258 URL: https://issues.apache.org/jira/browse/CASSANDRA-6258 Project: Cassandra Issue Type: Bug Reporter: Jackson Chung Attachments: cassandra-6258.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (CASSANDRA-6258) Need the root clause in FBUtilities.classForName when there is exception loading class
[ https://issues.apache.org/jira/browse/CASSANDRA-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jackson Chung updated CASSANDRA-6258: - Attachment: cassandra-6258.patch.v2 ops, diff too much. This one should be clean Need the root clause in FBUtilities.classForName when there is exception loading class -- Key: CASSANDRA-6258 URL: https://issues.apache.org/jira/browse/CASSANDRA-6258 Project: Cassandra Issue Type: Bug Reporter: Jackson Chung Attachments: cassandra-6258.patch, cassandra-6258.patch.v2 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (CASSANDRA-6241) Assertion on MmappedSegmentedFile.floor doesn't tell us the path (filename)
Jackson Chung created CASSANDRA-6241:
-------------------------------------

Summary: Assertion on MmappedSegmentedFile.floor doesn't tell us the path (filename)
Key: CASSANDRA-6241
URL: https://issues.apache.org/jira/browse/CASSANDRA-6241
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

For whatever reason (hardware failure, excess load, etc.), we get this:

{panel}
ERROR [MutationStage:10] 2013-10-25 08:54:03,150 AbstractCassandraDaemon.java (line 132) Exception in thread Thread[MutationStage:10,5,main]
java.lang.AssertionError: 1711300 vs 974637
    at org.apache.cassandra.io.util.MmappedSegmentedFile.floor(MmappedSegmentedFile.java:62)
    at org.apache.cassandra.io.util.MmappedSegmentedFile.getSegment(MmappedSegmentedFile.java:77)
    at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:900)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:63)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:61)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
    at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:124)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1362)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1224)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1154)
    at org.apache.cassandra.db.Table.readCurrentIndexedColumns(Table.java:514)
    at org.apache.cassandra.db.Table.apply(Table.java:452)
    at org.apache.cassandra.db.Table.apply(Table.java:384)
    at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:294)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:51)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)
{panel}

But the assertion error doesn't tell us which SSTable has the problem, so it doesn't really help us.

I think we can simply append this.path to the assertion message to show the filename of the problematic file. I would also suggest making "1711300 vs 974637" clearer.

-- This message was sent by Atlassian JIRA (v6.1#6144)
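As a rough idea of what a more helpful assertion message could look like, the sketch below includes the file path and names both numbers. It assumes the two values in "1711300 vs 974637" are the requested position and the file length; the class, method names and example path are hypothetical, and this is not the actual MmappedSegmentedFile code:

```java
class SegmentLookup {
    // Builds a message that names the file and explains both numbers.
    static String describe(String path, long position, long length) {
        return String.format("%s: requested position %d is outside file length %d",
                             path, position, length);
    }

    // Throws AssertionError explicitly so the check works without the -ea JVM flag.
    static void checkPosition(String path, long position, long length) {
        if (position < 0 || position >= length)
            throw new AssertionError(describe(path, position, length));
    }

    public static void main(String[] args) {
        try {
            // Hypothetical SSTable path; the numbers are those from the report.
            checkPosition("/var/lib/cassandra/data/ks/cf-hd-1-Data.db", 1711300L, 974637L);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

An operator reading the log can then go straight to the named file instead of guessing which SSTable is damaged.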
[jira] [Created] (CASSANDRA-5924) If migration (upgrade) failed mid-way, some data will be lost on the upgraded instance
Jackson Chung created CASSANDRA-5924:
-------------------------------------

Summary: If migration (upgrade) failed mid-way, some data will be lost on the upgraded instance
Key: CASSANDRA-5924
URL: https://issues.apache.org/jira/browse/CASSANDRA-5924
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

When upgrading from 1.0 to 1.1, C* checks the system keyspace (schema_keyspaces) to see whether a migration is needed. If it is, C* proceeds with migrateSSTables. But this process has no particular order (File.listFiles() does not guarantee any order), and an IOException can be thrown (e.g. on failure to create a directory).

In some of our upgrades, system was migrated first, followed by some KSs/CFs, but before all the KSs/CFs were finished, the migration failed on a custom directory containing files whose names resemble the sstable file convention (contains -). They really shouldn't be there and we are removing them. But because of them, C* tried to create a directory for such a file and failed with an IOException due to ownership/permissions. As a result, C* failed to start.

Without anyone knowing why C* had failed to start in the first place, C* was restarted. Only this time C* did not think it needed to migrate any more (system was already migrated, so schema_keyspaces exists). As a result, the remaining KSs/CFs were never migrated.

Our root cause is the custom directory and its ownership/permissions, and again we are removing those files to re-upgrade. But the point of this jira is that an IOException (or any other exception) can still be thrown for various reasons during this process and lead to the same problem: some CFs fail to be migrated.

1.2 seems to have some handling code, but it looks like a RuntimeException would still be thrown, and that would still be caught by AbstractCassandraDaemon (or CassandraDaemon in 1.2):

{code}
catch (Throwable e)
{
    logger.error("Exception encountered during startup", e);

    // try to warn user on stdout too, if we haven't already detached
    e.printStackTrace();
    System.out.println("Exception encountered during startup: " + e.getMessage());

    System.exit(3);
}
{code}

And so I think this problem still exists in 1.2.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
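One way to picture the ordering problem in this report: if the keyspace whose presence marks the migration as "done" (system, per the description) were processed last, a mid-way failure would leave that marker absent and the next start would retry. The sketch below only illustrates that idea; the class and method names are hypothetical, and it is not what any Cassandra version actually does:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MigrationOrder {
    // Returns the keyspace directory names in a deterministic order,
    // with "system" moved to the end so it is migrated last.
    static List<String> order(List<String> keyspaceDirs) {
        List<String> sorted = new ArrayList<>(keyspaceDirs);
        Collections.sort(sorted);          // File.listFiles() guarantees no order, so sort explicitly
        if (sorted.remove("system"))       // remove(Object) returns true if it was present
            sorted.add("system");
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical directory names, in the arbitrary order a listing might return them.
        System.out.println(order(Arrays.asList("system", "Keyspace1", "custom")));
        // prints [Keyspace1, custom, system]
    }
}
```

This does not address the other half of the report (the startup handler exiting on any Throwable); it only shows how an explicit order removes the dependence on File.listFiles().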
[jira] [Created] (CASSANDRA-5289) DatabaseDescriptor.hasExistingNoSystemTables return true even with only system keyspace
Jackson Chung created CASSANDRA-5289:
-------------------------------------

Summary: DatabaseDescriptor.hasExistingNoSystemTables return true even with only system keyspace
Key: CASSANDRA-5289
URL: https://issues.apache.org/jira/browse/CASSANDRA-5289
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Priority: Minor

The hasExistingNoSystemTables method in DatabaseDescriptor checks for directories only. On a fresh start, the system KS is created, so this method currently returns true because of it, resulting in an incorrect/confusing log message:

logger.info("Found table data in data directories. Consider using the CLI to define your schema.");

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Deleted] (CASSANDRA-5289) DatabaseDescriptor.hasExistingNoSystemTables return true even with only system keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-5289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackson Chung updated CASSANDRA-5289:
------------------------------------

Comment: was deleted

(was: Patch to add a check that the name is not equal to Table.SYSTEM_KS. The patch is based on the 1.1 branch. Note that for trunk, Table.SYSTEM_KS is changed to Table.SYSTEM_TABLE.)

DatabaseDescriptor.hasExistingNoSystemTables return true even with only system keyspace
---------------------------------------------------------------------------------------

Key: CASSANDRA-5289
URL: https://issues.apache.org/jira/browse/CASSANDRA-5289
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Priority: Minor

The hasExistingNoSystemTables method in DatabaseDescriptor checks for directories only. On a fresh start, the system KS is created, so this method currently returns true because of it, resulting in an incorrect/confusing log message:

logger.info("Found table data in data directories. Consider using the CLI to define your schema.");

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-5289) DatabaseDescriptor.hasExistingNoSystemTables return true even with only system keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-5289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackson Chung updated CASSANDRA-5289:
------------------------------------

Attachment: 5289.diff

Patch to add a check that the name is not equal to Table.SYSTEM_KS. The patch is based on the 1.1 branch. Note that for trunk, Table.SYSTEM_KS is changed to Table.SYSTEM_TABLE.

DatabaseDescriptor.hasExistingNoSystemTables return true even with only system keyspace
---------------------------------------------------------------------------------------

Key: CASSANDRA-5289
URL: https://issues.apache.org/jira/browse/CASSANDRA-5289
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Priority: Minor
Attachments: 5289.diff

The hasExistingNoSystemTables method in DatabaseDescriptor checks for directories only. On a fresh start, the system KS is created, so this method currently returns true because of it, resulting in an incorrect/confusing log message:

logger.info("Found table data in data directories. Consider using the CLI to define your schema.");

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
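The check described for 5289.diff (ignore the system keyspace directory when looking for existing table data) can be sketched roughly as follows. This is an illustration under assumptions, not the attached diff: the class name is hypothetical, the method is a simplified stand-in for the DatabaseDescriptor one, and the value "system" for the system keyspace name is assumed here.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class NonSystemTableCheck {
    // Assumed name of the system keyspace directory (stand-in for Table.SYSTEM_KS).
    static final String SYSTEM_KS = "system";

    // True only if the data directory holds a keyspace directory other than the system one.
    static boolean hasExistingNonSystemTables(File dataDir) {
        File[] children = dataDir.listFiles();
        if (children == null)
            return false; // not a directory, or an I/O error occurred
        for (File child : children)
            if (child.isDirectory() && !child.getName().equals(SYSTEM_KS))
                return true;
        return false;
    }

    public static void main(String[] args) throws IOException {
        File dataDir = Files.createTempDirectory("cassandra-data").toFile();
        new File(dataDir, SYSTEM_KS).mkdir();
        System.out.println(hasExistingNonSystemTables(dataDir)); // only the system keyspace exists
        new File(dataDir, "Keyspace1").mkdir();
        System.out.println(hasExistingNonSystemTables(dataDir)); // a user keyspace now exists
    }
}
```

With a check like this, a brand-new node that has only created its system keyspace would no longer log the "Found table data in data directories" hint.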
[jira] [Commented] (CASSANDRA-4191) Add `nodetool cfstats ks cf` abilities
[ https://issues.apache.org/jira/browse/CASSANDRA-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509986#comment-13509986 ]

Jackson Chung commented on CASSANDRA-4191:
------------------------------------------

The 2nd index option would be nice :)

Add `nodetool cfstats ks cf` abilities
--------------------------------------

Key: CASSANDRA-4191
URL: https://issues.apache.org/jira/browse/CASSANDRA-4191
Project: Cassandra
Issue Type: New Feature
Affects Versions: 1.2.0 beta 1
Reporter: Joaquin Casares
Assignee: Dave Brosius
Priority: Minor
Labels: datastax_qa
Attachments: 4191_specific_cfstats.diff

This way cfstats will only print information for the given keyspace/column family combinations.

Another related proposal, as an alternative to this ticket: allow `nodetool cfstats` to use --excludes or --includes to accept keyspace and column family arguments.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4740) Phantom TCP connections, failing hinted handoff
[ https://issues.apache.org/jira/browse/CASSANDRA-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504272#comment-13504272 ]

Jackson Chung commented on CASSANDRA-4740:
------------------------------------------

We ran into the issue without having the stacksize problem (since our jvm is 1.6.0_32 and the stacksize problem is = _32)

Phantom TCP connections, failing hinted handoff
-----------------------------------------------

Key: CASSANDRA-4740
URL: https://issues.apache.org/jira/browse/CASSANDRA-4740
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.1.2
Environment: Linux 3.4.9, java 1.6.0_35-b10
Reporter: Mina Naguib
Priority: Minor
Labels: connection, handoff, hinted, orphan, phantom, tcp, zombie
Attachments: write_latency.png

IP addresses in report anonymized:

Had a server running cassandra (1.1.1.10) reboot ungracefully. Reboot and startup were successful and uneventful. cassandra went back into service ok.

From that point onwards, however, several (but not all) machines in the cassandra cluster started having difficulty with hinted handoff to that machine. This was despite nodetool ring showing Up across the board.

Here's an example of an attempt, every 10 minutes, by a node (1.1.1.11) to replay hints to the node that was rebooted:

{code}
INFO [HintedHandoff:1] 2012-10-01 11:07:23,293 HintedHandOffManager.java (line 294) Started hinted handoff for token: 122879743610338889583996386017027409691 with IP: /1.1.1.10
INFO [HintedHandoff:1] 2012-10-01 11:07:33,295 HintedHandOffManager.java (line 372) Timed out replaying hints to /1.1.1.10; aborting further deliveries
INFO [HintedHandoff:1] 2012-10-01 11:07:33,295 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /1.1.1.10
INFO [HintedHandoff:1] 2012-10-01 11:17:23,312 HintedHandOffManager.java (line 294) Started hinted handoff for token: 122879743610338889583996386017027409691 with IP: /1.1.1.10
INFO [HintedHandoff:1] 2012-10-01 11:17:33,319 HintedHandOffManager.java (line 372) Timed out replaying hints to /1.1.1.10; aborting further deliveries
INFO [HintedHandoff:1] 2012-10-01 11:17:33,319 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /1.1.1.10
INFO [HintedHandoff:1] 2012-10-01 11:27:23,335 HintedHandOffManager.java (line 294) Started hinted handoff for token: 122879743610338889583996386017027409691 with IP: /1.1.1.10
INFO [HintedHandoff:1] 2012-10-01 11:27:33,337 HintedHandOffManager.java (line 372) Timed out replaying hints to /1.1.1.10; aborting further deliveries
INFO [HintedHandoff:1] 2012-10-01 11:27:33,337 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /1.1.1.10
INFO [HintedHandoff:1] 2012-10-01 11:37:23,357 HintedHandOffManager.java (line 294) Started hinted handoff for token: 122879743610338889583996386017027409691 with IP: /1.1.1.10
INFO [HintedHandoff:1] 2012-10-01 11:37:33,358 HintedHandOffManager.java (line 372) Timed out replaying hints to /1.1.1.10; aborting further deliveries
INFO [HintedHandoff:1] 2012-10-01 11:37:33,359 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /1.1.1.10
INFO [HintedHandoff:1] 2012-10-01 11:47:23,412 HintedHandOffManager.java (line 294) Started hinted handoff for token: 122879743610338889583996386017027409691 with IP: /1.1.1.10
INFO [HintedHandoff:1] 2012-10-01 11:47:33,414 HintedHandOffManager.java (line 372) Timed out replaying hints to /1.1.1.10; aborting further deliveries
INFO [HintedHandoff:1] 2012-10-01 11:47:33,414 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /1.1.1.10
{code}

I started poking around, and discovered that several nodes held ESTABLISHED TCP connections that didn't have a live endpoint on the rebooted node. My guess is they were live prior to the reboot, and after the reboot the nodes still see them as live and unsuccessfully try to use them.

Example, on the node that was rebooted:

{code}
.10 ~ # netstat -tn | grep 1.1.1.11
tcp  0  0  1.1.1.10:7000   1.1.1.11:40960  ESTABLISHED
tcp  0  0  1.1.1.10:34370  1.1.1.11:7000   ESTABLISHED
tcp  0  0  1.1.1.10:45518  1.1.1.11:7000   ESTABLISHED
{code}

While on the node that's failing to hint to it:

{code}
.11 ~ # netstat -tn | grep 1.1.1.10
tcp  0  0  1.1.1.11:7000   1.1.1.10:34370  ESTABLISHED
tcp  0  0  1.1.1.11:7000   1.1.1.10:45518  ESTABLISHED
tcp  0  0  1.1.1.11:7000   1.1.1.10:53316  ESTABLISHED
tcp  0  0  1.1.1.11:7000   1.1.1.10:43239  ESTABLISHED
tcp  0  0  1.1.1.11:40960  1.1.1.10:7000   ESTABLISHED
{code}

Notice the
[jira] [Commented] (CASSANDRA-4740) Phantom TCP connections, failing hinted handoff
[ https://issues.apache.org/jira/browse/CASSANDRA-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502535#comment-13502535 ]

Jackson Chung commented on CASSANDRA-4740:
------------------------------------------

We see a similar thing too. On box 192.168.13.56, looking for all ESTABLISHED connections from other nodes to this box's port 7000:

{panel}
$ netstat -ant | grep 192.168.13.56:7000.*EST | cut -d ':' -f 1-2 | sort | uniq -c
      1 tcp  0  0  192.168.13.56:7000  192.168.12.13
      2 tcp  0  0  192.168.13.56:7000  192.168.14.145
    217 tcp  0  0  192.168.13.56:7000  192.168.44.237
    202 tcp  0  0  192.168.13.56:7000  192.168.45.67
    198 tcp  0  0  192.168.13.56:7000  192.168.46.141
     11 tcp  0  0  192.168.13.56:7000  192.168.76.156
     10 tcp  0  0  192.168.13.56:7000  192.168.77.72
     11 tcp  0  0  192.168.13.56:7000  192.168.78.153
{panel}

On 192.168.44.237, it shows just 1 ESTABLISHED connection to 192.168.13.56:7000:

{panel}
$ sudo netstat -antp | grep 192.168.44.237.*192.168.13.56:7000
tcp  0  0  192.168.44.237:35252  192.168.13.56:7000  ESTABLISHED  14398/java
{panel}

We also have an HH problem similar to the above (though I don't see in the logs on these 2 nodes that the timeouts happened to these 2 nodes). We also have nodes flapping. It also turned out that the firewall rule wasn't opened on some nodes to allow communication with all nodes on port 7000. Restarting the node fixes the issue.

version: {panel} uname -a Linux kca06apigee 3.2.21-1.32.6.amzn1.x86_64 #1 SMP Sat Jun 23 02:32:15 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux $ /usr/java/latest/bin/java -version java version 1.6.0_31 Java(TM) SE Runtime Environment (build 1.6.0_31-b04) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) {panel} How does netstat on 1 box shows 200+ ESTABLISHED conn to the other box while the other box only show 1 Phantom TCP connections, failing hinted handoff --- Key: CASSANDRA-4740 URL: https://issues.apache.org/jira/browse/CASSANDRA-4740 Project: Cassandra Issue Type: Bug Affects Versions: 1.1.2 Environment: Linux 3.4.9, java 1.6.0_35-b10 Reporter: Mina Naguib Priority: Minor Labels: connection, handoff, hinted, orphan, phantom, tcp, zombie Attachments: write_latency.png
[jira] [Commented] (CASSANDRA-4675) NPE in NTS when using LQ against a node (DC) that doesn't have replica
[ https://issues.apache.org/jira/browse/CASSANDRA-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467247#comment-13467247 ]

Jackson Chung commented on CASSANDRA-4675:
------------------------------------------

+1 thx

NPE in NTS when using LQ against a node (DC) that doesn't have replica
----------------------------------------------------------------------

Key: CASSANDRA-4675
URL: https://issues.apache.org/jira/browse/CASSANDRA-4675
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.7.0
Reporter: Jackson Chung
Assignee: Jonathan Ellis
Priority: Minor
Fix For: 1.1.6
Attachments: 4675.txt

In a NetworkTopologyStrategy setup where there are 2 DCs:

{panel}
Address    DC   Rack  Status  State   Load       Owns    Token
                                                         85070591730234615865843651857942052864
127.0.0.1  dc1  r1    Up      Normal  115.78 KB  50.00%  0
127.0.0.2  dc2  r1    Up      Normal  129.3 KB   50.00%  85070591730234615865843651857942052864
{panel}

I have a KS that has replicas in only 1 of the DCs (dc1):

{panel}
[default@unknown] describe Keyspace3;
Keyspace: Keyspace3:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
  Options: [dc1:1]
  Column Families:
    ColumnFamily: testcf
{panel}

But if I connect to a node in dc2, using LOCAL_QUORUM, I get an NPE in the Cassandra node's log:

{panel}
[default@unknown] consistencylevel as LOCAL_QUORUM;
Consistency level is set to 'LOCAL_QUORUM'.
[default@unknown] use Keyspace3;
Authenticated to keyspace: Keyspace3
[default@Keyspace3] get testcf[utf8('k1')][utf8('c1')];
Internal error processing get
org.apache.thrift.TApplicationException: Internal error processing get
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_get(Cassandra.java:511)
    at org.apache.cassandra.thrift.Cassandra$Client.get(Cassandra.java:492)
    at org.apache.cassandra.cli.CliClient.executeGet(CliClient.java:648)
    at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:209)
    at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:220)
    at org.apache.cassandra.cli.CliMain.main(CliMain.java:348)
{panel}

node2's log:

{panel}
ERROR [Thrift:3] 2012-09-17 18:15:16,868 Cassandra.java (line 2999) Internal error processing get
java.lang.NullPointerException
    at org.apache.cassandra.locator.NetworkTopologyStrategy.getReplicationFactor(NetworkTopologyStrategy.java:142)
    at org.apache.cassandra.service.DatacenterReadCallback.determineBlockFor(DatacenterReadCallback.java:90)
    at org.apache.cassandra.service.ReadCallback.<init>(ReadCallback.java:67)
    at org.apache.cassandra.service.DatacenterReadCallback.<init>(DatacenterReadCallback.java:63)
    at org.apache.cassandra.service.StorageProxy.getReadCallback(StorageProxy.java:775)
    at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:609)
    at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:564)
    at org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:128)
    at org.apache.cassandra.thrift.CassandraServer.internal_get(CassandraServer.java:383)
    at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:401)
    at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2989)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
{panel}

I could work around it by adding dc2:0 to the options:

{panel}
[default@Keyspace3] describe Keyspace3;
Keyspace: Keyspace3:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
  Options: [dc2:0, dc1:1]
  Column Families:
    ColumnFamily: testcf
{panel}

Now you get UA:

{panel}
[default@Keyspace3] get
[jira] [Commented] (CASSANDRA-4675) NPE in NTS when using LQ against a node (DC) that doesn't have replica
[ https://issues.apache.org/jira/browse/CASSANDRA-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13458244#comment-13458244 ]

Jackson Chung commented on CASSANDRA-4675:

1.0.10

NPE in NTS when using LQ against a node (DC) that doesn't have replica

Key: CASSANDRA-4675
URL: https://issues.apache.org/jira/browse/CASSANDRA-4675
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Priority: Minor

In a NetworkTopologyStrategy setup where there are 2 DCs:

{panel}
Address     DC   Rack  Status  State   Load       Owns     Token
                                                           85070591730234615865843651857942052864
127.0.0.1   dc1  r1    Up      Normal  115.78 KB  50.00%   0
127.0.0.2   dc2  r1    Up      Normal  129.3 KB   50.00%   85070591730234615865843651857942052864
{panel}

I have a KS that has replicas in only 1 of the DCs (dc1):

{panel}
[default@unknown] describe Keyspace3;
Keyspace: Keyspace3:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [dc1:1]
  Column Families:
    ColumnFamily: testcf
{panel}

But if I connect to a node in dc2, using LOCAL_QUORUM, I get an NPE in the Cassandra node's log:

{panel}
[default@unknown] consistencylevel as LOCAL_QUORUM;
Consistency level is set to 'LOCAL_QUORUM'.
[default@unknown] use Keyspace3; Authenticated to keyspace: Keyspace3 [default@Keyspace3] get testcf[utf8('k1')][utf8('c1')]; Internal error processing get org.apache.thrift.TApplicationException: Internal error processing get at org.apache.thrift.TApplicationException.read(TApplicationException.java:108) at org.apache.cassandra.thrift.Cassandra$Client.recv_get(Cassandra.java:511) at org.apache.cassandra.thrift.Cassandra$Client.get(Cassandra.java:492) at org.apache.cassandra.cli.CliClient.executeGet(CliClient.java:648) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:209) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:220) at org.apache.cassandra.cli.CliMain.main(CliMain.java:348) {panel} node2's log: {panel} ERROR [Thrift:3] 2012-09-17 18:15:16,868 Cassandra.java (line 2999) Internal error processing get java.lang.NullPointerException at org.apache.cassandra.locator.NetworkTopologyStrategy.getReplicationFactor(NetworkTopologyStrategy.java:142) at org.apache.cassandra.service.DatacenterReadCallback.determineBlockFor(DatacenterReadCallback.java:90) at org.apache.cassandra.service.ReadCallback.init(ReadCallback.java:67) at org.apache.cassandra.service.DatacenterReadCallback.init(DatacenterReadCallback.java:63) at org.apache.cassandra.service.StorageProxy.getReadCallback(StorageProxy.java:775) at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:609) at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:564) at org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:128) at org.apache.cassandra.thrift.CassandraServer.internal_get(CassandraServer.java:383) at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:401) at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2989) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889) at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) {panel} I could workaround it by adding dc2:0 to the option: {panel} [default@Keyspace3] describe Keyspace3; Keyspace: Keyspace3: Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy Durable Writes: true Options: [dc2:0, dc1:1] Column Families: ColumnFamily: testcf {panel} Now you get UA: {panel} [default@Keyspace3] get testcf[utf8('k1')][utf8('c1')]; null UnavailableException() at
[jira] [Created] (CASSANDRA-4675) NPE in NTS when using LQ against a node (DC) that doesn't have replica
Jackson Chung created CASSANDRA-4675:

Summary: NPE in NTS when using LQ against a node (DC) that doesn't have replica
Key: CASSANDRA-4675
URL: https://issues.apache.org/jira/browse/CASSANDRA-4675
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Priority: Minor

In a NetworkTopologyStrategy setup where there are 2 DCs:

{panel}
Address     DC   Rack  Status  State   Load       Owns     Token
                                                           85070591730234615865843651857942052864
127.0.0.1   dc1  r1    Up      Normal  115.78 KB  50.00%   0
127.0.0.2   dc2  r1    Up      Normal  129.3 KB   50.00%   85070591730234615865843651857942052864
{panel}

I have a KS that has replicas in only 1 of the DCs (dc1):

{panel}
[default@unknown] describe Keyspace3;
Keyspace: Keyspace3:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [dc1:1]
  Column Families:
    ColumnFamily: testcf
{panel}

But if I connect to a node in dc2, using LOCAL_QUORUM, I get an NPE in the Cassandra node's log:

{panel}
[default@unknown] consistencylevel as LOCAL_QUORUM;
Consistency level is set to 'LOCAL_QUORUM'.
[default@unknown] use Keyspace3; Authenticated to keyspace: Keyspace3 [default@Keyspace3] get testcf[utf8('k1')][utf8('c1')]; Internal error processing get org.apache.thrift.TApplicationException: Internal error processing get at org.apache.thrift.TApplicationException.read(TApplicationException.java:108) at org.apache.cassandra.thrift.Cassandra$Client.recv_get(Cassandra.java:511) at org.apache.cassandra.thrift.Cassandra$Client.get(Cassandra.java:492) at org.apache.cassandra.cli.CliClient.executeGet(CliClient.java:648) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:209) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:220) at org.apache.cassandra.cli.CliMain.main(CliMain.java:348) {panel} node2's log: {panel} ERROR [Thrift:3] 2012-09-17 18:15:16,868 Cassandra.java (line 2999) Internal error processing get java.lang.NullPointerException at org.apache.cassandra.locator.NetworkTopologyStrategy.getReplicationFactor(NetworkTopologyStrategy.java:142) at org.apache.cassandra.service.DatacenterReadCallback.determineBlockFor(DatacenterReadCallback.java:90) at org.apache.cassandra.service.ReadCallback.init(ReadCallback.java:67) at org.apache.cassandra.service.DatacenterReadCallback.init(DatacenterReadCallback.java:63) at org.apache.cassandra.service.StorageProxy.getReadCallback(StorageProxy.java:775) at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:609) at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:564) at org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:128) at org.apache.cassandra.thrift.CassandraServer.internal_get(CassandraServer.java:383) at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:401) at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2989) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889) at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) {panel} I could workaround it by adding dc2:0 to the option: {panel} [default@Keyspace3] describe Keyspace3; Keyspace: Keyspace3: Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy Durable Writes: true Options: [dc2:0, dc1:1] Column Families: ColumnFamily: testcf {panel} Now you get UA: {panel} [default@Keyspace3] get testcf[utf8('k1')][utf8('c1')]; null UnavailableException() at org.apache.cassandra.thrift.Cassandra$get_result.read(Cassandra.java:6506) at org.apache.cassandra.thrift.Cassandra$Client.recv_get(Cassandra.java:519) at org.apache.cassandra.thrift.Cassandra$Client.get(Cassandra.java:492) at org.apache.cassandra.cli.CliClient.executeGet(CliClient.java:648) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:209)
[jira] [Created] (CASSANDRA-4423) auto completion in cqlsh should work when using fully qualified name
Jackson Chung created CASSANDRA-4423:

Summary: auto completion in cqlsh should work when using fully qualified name
Key: CASSANDRA-4423
URL: https://issues.apache.org/jira/browse/CASSANDRA-4423
Project: Cassandra
Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor

minor cqlsh improvement: the auto completion in cqlsh rocks, so this is just a nitpick. If I type

{panel}
cqlsh> create KEYSPACE test WITH strategy_class = 'SimpleStrategy' and
{panel}

and then press tab twice, it auto completes to:

{panel}
cqlsh> create KEYSPACE test WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor =
{panel}

but if I use a fully qualified name:

{panel}
cqlsh> create KEYSPACE test WITH strategy_class = 'org.apache.cassandra.locator.SimpleStrategy' AND strategy_option_name
{panel}

it is not smart enough to figure out the available options. It'd be nice to make the auto completion work for those fully qualified cases.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
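To illustrate the request, here is a rough sketch of one way a completer could treat the short and the fully qualified strategy name the same way: strip the package prefix before looking up the options. This is not the cqlsh code (cqlsh is written in Python; Java is used here only for illustration), and the class name, the option table and both method names are hypothetical.

```java
import java.util.List;
import java.util.Map;

public class StrategyOptionCompletion {
    // Hypothetical table of completable options, keyed by the short strategy name.
    private static final Map<String, List<String>> OPTIONS = Map.of(
            "SimpleStrategy", List.of("replication_factor"));

    // Returns the part after the last '.', or the input itself if there is no '.'.
    static String shortName(String strategyClass) {
        int dot = strategyClass.lastIndexOf('.');
        return dot < 0 ? strategyClass : strategyClass.substring(dot + 1);
    }

    // Looks up options by short name, so both spellings resolve to the same entry.
    static List<String> optionsFor(String strategyClass) {
        return OPTIONS.getOrDefault(shortName(strategyClass), List.of());
    }

    public static void main(String[] args) {
        System.out.println(optionsFor("SimpleStrategy"));
        System.out.println(optionsFor("org.apache.cassandra.locator.SimpleStrategy"));
    }
}
```

Both calls in main resolve to the same option list. A real completer would presumably also have to decide what to do with custom strategy classes that share a short name with a built-in one; the sketch ignores that.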
[jira] [Created] (CASSANDRA-4255) concurrent modif ex when repair is run on LCS
Jackson Chung created CASSANDRA-4255: Summary: concurrent modif ex when repair is run on LCS Key: CASSANDRA-4255 URL: https://issues.apache.org/jira/browse/CASSANDRA-4255 Project: Cassandra Issue Type: Bug Reporter: Jackson Chung came across this, will try to figure a way to systematically reprod this. But the problem is the sstable list in the manifest is changing as the repair is triggered: {panel} Exception in thread main java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(Unknown Source) at java.util.AbstractList$Itr.next(Unknown Source) at org.apache.cassandra.io.sstable.SSTable.getTotalBytes(SSTable.java:250) at org.apache.cassandra.db.compaction.LeveledManifest.getEstimatedTasks(LeveledManifest.java:435) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getEstimatedRemainingTasks(LeveledCompactionStrategy.java:128) at org.apache.cassandra.db.compaction.CompactionManager.getPendingTasks(CompactionManager.java:1063) at sun.reflect.GeneratedMethodAccessor73.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown Source) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown Source) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(Unknown Source) at com.sun.jmx.mbeanserver.PerInterface.getAttribute(Unknown Source) at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(Unknown Source) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(Unknown Source) at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl.access$200(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(Unknown Source) at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(Unknown Source) at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source) at sun.rmi.transport.Transport$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Unknown Source) at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source)
{panel}

Maybe we could change the list to a CopyOnWriteArrayList? Just a suggestion, I haven't investigated much yet:

{code:title=LeveledManifest.java}
generations[i] = new ArrayList<SSTableReader>();
{code}
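For reference, a small standalone sketch of the difference the suggestion relies on. It is not Cassandra code: the class and method names are made up, and the "sstable sizes" are plain numbers. The list is modified from inside the loop to imitate the manifest changing while it is being read.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ManifestIterationSketch {
    // Sums the sizes and appends one element during the first step of the
    // iteration, imitating a list that changes while it is being read.
    static long sumWhileMutating(List<Long> sizes) {
        long total = 0;
        boolean mutated = false;
        for (long size : sizes) {
            total += size;
            if (!mutated) {
                sizes.add(99L);
                mutated = true;
            }
        }
        return total;
    }

    // True if iterating the given list while mutating it fails fast.
    static boolean throwsCme(List<Long> sizes) {
        try {
            sumWhileMutating(sizes);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // ArrayList iterators are fail-fast: the add() invalidates the iterator.
        System.out.println(throwsCme(new ArrayList<>(List.of(1L, 2L, 3L))));
        // CopyOnWriteArrayList iterates over the snapshot taken when the loop started.
        System.out.println(sumWhileMutating(new CopyOnWriteArrayList<>(List.of(1L, 2L, 3L))));
    }
}
```

The trade-off is that CopyOnWriteArrayList copies the backing array on every write, so it fits lists that are read far more often than they are modified. Whether that holds for the manifest's generations would need checking.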
[jira] [Updated] (CASSANDRA-4204) Pig does not work on DateType
[ https://issues.apache.org/jira/browse/CASSANDRA-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackson Chung updated CASSANDRA-4204:

Attachment: pig_1335816404547.log

Pig does not work on DateType

Key: CASSANDRA-4204
URL: https://issues.apache.org/jira/browse/CASSANDRA-4204
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Attachments: pig_1335816404547.log

cqlsh:PigDemo> describe columnfamily test1897;

CREATE COLUMNFAMILY test1897 (
  KEY text PRIMARY KEY,
  testcol timestamp
) WITH
  comment='' AND
  comparator=text AND
  row_cache_provider='SerializingCacheProvider' AND
  key_cache_size=20.00 AND
  row_cache_size=0.00 AND
  read_repair_chance=1.00 AND
  gc_grace_seconds=864000 AND
  default_validation=blob AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  row_cache_save_period_in_seconds=0 AND
  key_cache_save_period_in_seconds=14400 AND
  replicate_on_write=True;

cqlsh:PigDemo> select * from test1897;
 KEY  | testcol
------+----------------------
 akey | 2012-01-21 00:14:12+

$ cat test1897.pig
cassandra_data = LOAD 'cassandra://PigDemo/test1897' USING CassandraStorage() AS (name, columns: bag {T: tuple()});
dump cassandra_data;

There seems to be a problem with the DateType: the simple Pig script above fails with the attached error.
[jira] [Created] (CASSANDRA-4204) Pig does not work on DateType
Jackson Chung created CASSANDRA-4204:

Summary: Pig does not work on DateType
Key: CASSANDRA-4204
URL: https://issues.apache.org/jira/browse/CASSANDRA-4204
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Attachments: pig_1335816404547.log

cqlsh:PigDemo> describe columnfamily test1897;

CREATE COLUMNFAMILY test1897 (
  KEY text PRIMARY KEY,
  testcol timestamp
) WITH
  comment='' AND
  comparator=text AND
  row_cache_provider='SerializingCacheProvider' AND
  key_cache_size=20.00 AND
  row_cache_size=0.00 AND
  read_repair_chance=1.00 AND
  gc_grace_seconds=864000 AND
  default_validation=blob AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  row_cache_save_period_in_seconds=0 AND
  key_cache_save_period_in_seconds=14400 AND
  replicate_on_write=True;

cqlsh:PigDemo> select * from test1897;
 KEY  | testcol
------+----------------------
 akey | 2012-01-21 00:14:12+

$ cat test1897.pig
cassandra_data = LOAD 'cassandra://PigDemo/test1897' USING CassandraStorage() AS (name, columns: bag {T: tuple()});
dump cassandra_data;

There seems to be a problem with the DateType: the simple Pig script above fails with the attached error.
[jira] [Created] (CASSANDRA-4193) cql delete does not delete
Jackson Chung created CASSANDRA-4193:

Summary: cql delete does not delete
Key: CASSANDRA-4193
URL: https://issues.apache.org/jira/browse/CASSANDRA-4193
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

tested in 1.1 and trunk branch on a single node:

{panel}
cqlsh:test> create table testcf_old ( username varchar , id int , name varchar , stuff varchar, primary key(username,id,name)) with compact storage;
cqlsh:test> insert into testcf_old ( username , id , name , stuff ) values ('abc', 2, 'rst', 'some other bunch of craps');
cqlsh:test> select * from testcf_old;
 username | id | name | stuff
----------+----+------+---------------------------
 abc      |  2 | rst  | some other bunch of craps
 abc      |  4 | xyz  | a bunch of craps

cqlsh:test> delete from testcf_old where username = 'abc' and id =2;
cqlsh:test> select * from testcf_old;
 username | id | name | stuff
----------+----+------+---------------------------
 abc      |  2 | rst  | some other bunch of craps
 abc      |  4 | xyz  | a bunch of craps
{panel}

same also when not using compact:

{panel}
cqlsh:test> create table testcf ( username varchar , id int , name varchar , stuff varchar, primary key(username,id));
cqlsh:test> select * from testcf;
 username | id | name                      | stuff
----------+----+---------------------------+------------------
 abc      |  2 | some other bunch of craps | rst
 abc      |  4 | xyz                       | a bunch of craps

cqlsh:test> delete from testcf where username = 'abc' and id =2;
cqlsh:test> select * from testcf;
 username | id | name                      | stuff
----------+----+---------------------------+------------------
 abc      |  2 | some other bunch of craps | rst
 abc      |  4 | xyz                       | a bunch of craps
{panel}
[jira] [Commented] (CASSANDRA-3220) add describe_ring to cli
[ https://issues.apache.org/jira/browse/CASSANDRA-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108050#comment-13108050 ] Jackson Chung commented on CASSANDRA-3220: -- i agree on all the points. I could use a help on what should be expected though. For instance, when use keyspace is issued without keyspace name, you get the similar output: {noformat} [default@Keyspace3] use keyspace; Syntax error at position 4: mismatched input 'keyspace' expecting set null {noformat} The stacktrace is: java.lang.RuntimeException: Syntax error at position 4: mismatched input 'keyspace' expecting set null at org.apache.cassandra.cli.CliCompiler.compileQuery(CliCompiler.java:88) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:197) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:222) at org.apache.cassandra.cli.CliMain.main(CliMain.java:350) Caused by: java.lang.RuntimeException: Syntax error at position 4: mismatched input 'keyspace' expecting set null at org.apache.cassandra.cli.CliParser.reportError(CliParser.java:197) at org.apache.cassandra.cli.CliParser.entityName(CliParser.java:7745) at org.apache.cassandra.cli.CliParser.keyspace(CliParser.java:7259) at org.apache.cassandra.cli.CliParser.useKeyspace(CliParser.java:5713) at org.apache.cassandra.cli.CliParser.statement(CliParser.java:528) at org.apache.cassandra.cli.CliParser.root(CliParser.java:229) at org.apache.cassandra.cli.CliCompiler.compileQuery(CliCompiler.java:79) In any case. For this, I think I could use the describe keyspace pattern, which in the case where no keyspace is given, it ask you to use one first; and then if you are authenticated to one, and run describe ring without the keyspace name, describe the current one. Agree? 
{noformat} [default@unknown] describe; Authenticate to a Keyspace, before using `describe` or `describe column_family` [default@unknown] use Keyspace3; Authenticated to keyspace: Keyspace3 [default@Keyspace3] describe; Keyspace: Keyspace3: Replication Strategy: org.apache.cassandra.locator.SimpleStrategy Durable Writes: true Options: [replication_factor:4] Column Families: ColumnFamily: testcf Key Validation Class: org.apache.cassandra.db.marshal.BytesType Default column value validator: org.apache.cassandra.db.marshal.BytesType Columns sorted by: org.apache.cassandra.db.marshal.BytesType Row cache size / save period in seconds / keys to save : 0.0/0/all Key cache size / save period in seconds: 100.0/14400 GC grace seconds: 864000 Compaction min/max thresholds: 4/32 Read repair chance: 1.0 Replicate on write: true Built indexes: [] Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy {noformat} agree? For using system, it error out because: [default@system] describe ring system; null InvalidRequestException(why:There is no ring for the keyspace: system) at org.apache.cassandra.thrift.Cassandra$describe_ring_result.read(Cassandra.java:23267) at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_ring(Cassandra.java:1262) at org.apache.cassandra.thrift.Cassandra$Client.describe_ring(Cassandra.java:1237) at org.apache.cassandra.cli.CliClient.executeDescribeRing(CliClient.java:1437) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:288) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:222) at org.apache.cassandra.cli.CliMain.main(CliMain.java:350) It's probably because this keyspace uses internal strategy org.apache.cassandra.locator.LocalStrategy? 
In any case, the exception is thrown because of this check in CassandraServer.java:

{code}
public List<TokenRange> describe_ring(String keyspace) throws InvalidRequestException
{
    if (keyspace == null || !Schema.instance.getNonSystemTables().contains(keyspace))
        throw new InvalidRequestException("There is no ring for the keyspace: " + keyspace);
{code}

So my options in the cli are either:
1) do what CassandraServer does: reject the request if the keyspace given is a system table, and print a pretty error
2) catch the InvalidRequestException, check whether the message is about "There is no ring...", and error out with some pretty print.
(I don't like the 2nd option because the message is currently hardcoded within CassandraServer; there is no enum or constant for the message string to reference.)

I don't think the datacenter=null is part of the problem here. It looks to be another bug, with the datacenter value not populated in the EndpointDetails. If anything, is it a bug from https://issues.apache.org/jira/browse/CASSANDRA-2882 ?

add describe_ring to cli
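Option 1 from the comment above could look roughly like the following on the cli side. This is only a sketch under assumptions: the class name, the method name, the hardcoded set of system keyspace names and the error texts are all hypothetical and not taken from the patch.

```java
import java.util.Optional;
import java.util.Set;

public class DescribeRingPrecheck {
    // Assumption: the cli can tell which keyspace names are system keyspaces.
    static final Set<String> SYSTEM_KEYSPACES = Set.of("system");

    // Returns a readable error if describe ring should not be sent to the
    // server for this keyspace, or an empty Optional if the call can go ahead.
    static Optional<String> rejectReason(String keyspace) {
        if (keyspace == null || keyspace.isEmpty())
            return Optional.of("No keyspace given; authenticate to a keyspace first or name one.");
        if (SYSTEM_KEYSPACES.contains(keyspace))
            return Optional.of("There is no ring for the system keyspace '" + keyspace + "'.");
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(rejectReason("system").orElse("ok"));
        System.out.println(rejectReason("Keyspace3").orElse("ok"));
    }
}
```

Checking before the Thrift call avoids matching on the server's hardcoded message string, which is the objection to option 2. The downside is that the cli then duplicates a rule that lives in CassandraServer.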
[jira] [Issue Comment Edited] (CASSANDRA-3220) add describe_ring to cli
[ https://issues.apache.org/jira/browse/CASSANDRA-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108050#comment-13108050 ] Jackson Chung edited comment on CASSANDRA-3220 at 9/19/11 6:49 PM: --- i agree on all the points. I could use a help on what should be expected though. For instance, when use keyspace is issued without keyspace name, you get the similar output: {noformat} [default@Keyspace3] use keyspace; Syntax error at position 4: mismatched input 'keyspace' expecting set null {noformat} The stacktrace is: java.lang.RuntimeException: Syntax error at position 4: mismatched input 'keyspace' expecting set null at org.apache.cassandra.cli.CliCompiler.compileQuery(CliCompiler.java:88) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:197) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:222) at org.apache.cassandra.cli.CliMain.main(CliMain.java:350) Caused by: java.lang.RuntimeException: Syntax error at position 4: mismatched input 'keyspace' expecting set null at org.apache.cassandra.cli.CliParser.reportError(CliParser.java:197) at org.apache.cassandra.cli.CliParser.entityName(CliParser.java:7745) at org.apache.cassandra.cli.CliParser.keyspace(CliParser.java:7259) at org.apache.cassandra.cli.CliParser.useKeyspace(CliParser.java:5713) at org.apache.cassandra.cli.CliParser.statement(CliParser.java:528) at org.apache.cassandra.cli.CliParser.root(CliParser.java:229) at org.apache.cassandra.cli.CliCompiler.compileQuery(CliCompiler.java:79) In any case. For this, I think I could use the describe keyspace pattern, which in the case where no keyspace is given, it ask you to use one first; and then if you are authenticated to one, and run describe ring without the keyspace name, describe the current one. Agree? 
{noformat} [default@unknown] describe; Authenticate to a Keyspace, before using `describe` or `describe column_family` [default@unknown] use Keyspace3; Authenticated to keyspace: Keyspace3 [default@Keyspace3] describe; Keyspace: Keyspace3: Replication Strategy: org.apache.cassandra.locator.SimpleStrategy Durable Writes: true Options: [replication_factor:4] Column Families: ColumnFamily: testcf Key Validation Class: org.apache.cassandra.db.marshal.BytesType Default column value validator: org.apache.cassandra.db.marshal.BytesType Columns sorted by: org.apache.cassandra.db.marshal.BytesType Row cache size / save period in seconds / keys to save : 0.0/0/all Key cache size / save period in seconds: 100.0/14400 GC grace seconds: 864000 Compaction min/max thresholds: 4/32 Read repair chance: 1.0 Replicate on write: true Built indexes: [] Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy {noformat} For using system, it error out because: [default@system] describe ring system; null InvalidRequestException(why:There is no ring for the keyspace: system) at org.apache.cassandra.thrift.Cassandra$describe_ring_result.read(Cassandra.java:23267) at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_ring(Cassandra.java:1262) at org.apache.cassandra.thrift.Cassandra$Client.describe_ring(Cassandra.java:1237) at org.apache.cassandra.cli.CliClient.executeDescribeRing(CliClient.java:1437) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:288) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:222) at org.apache.cassandra.cli.CliMain.main(CliMain.java:350) It's probably because this keyspace uses internal strategy org.apache.cassandra.locator.LocalStrategy? 
In any case, the exception is thrown because of this check in CassandraServer.java:

{code}
public List<TokenRange> describe_ring(String keyspace) throws InvalidRequestException
{
    if (keyspace == null || !Schema.instance.getNonSystemTables().contains(keyspace))
        throw new InvalidRequestException("There is no ring for the keyspace: " + keyspace);
{code}

So my options in the cli are either:
1) do what CassandraServer does: reject the request if the keyspace given is a system table, and print a pretty error
2) catch the InvalidRequestException, check whether the message is about "There is no ring...", and error out with some pretty print.
(I don't like the 2nd option because the message is currently hardcoded within CassandraServer; there is no enum or constant for the message string to reference.)

I don't think the datacenter=null is part of the problem here. It looks to be another bug, with the datacenter value not populated in the EndpointDetails. If anything, is it a bug from https://issues.apache.org/jira/browse/CASSANDRA-2882 ?

was (Author:
[jira] [Created] (CASSANDRA-3220) add describe_ring to cli
add describe_ring to cli Key: CASSANDRA-3220 URL: https://issues.apache.org/jira/browse/CASSANDRA-3220 Project: Cassandra Issue Type: Improvement Reporter: Jackson Chung Priority: Minor Lately I have found the describe_ring feature was needed to debug/analyze issue, but the cli does not have this available. So just in case it is useful, please see the attached patch. here is the sample output: {noformat} [default@unknown] help; ... ... decrDecrements a counter column. describe ring Describe the token range information. describe clusterDescribe the cluster configuration. ... ... [default@unknown] help describe ring; describe ring keyspace; Describes the token range settings for the named keyspace. Required Parameters: - keyspace: Name of the keyspace to describe the token range. Examples: describe ring keyspace; - Describes the token range settings for the named keyspace. [default@unknown] describe ring Keyspace3; TokenRange: TokenRange(start_token:9739248273232290250409572410247679660, end_token:9739248273232290250409572410247679660, endpoints:[192.168.0.125], rpc_endpoints:[192.168.0.125], endpoint_details:[EndpointDetails(host:192.168.0.125, port:9160, datacenter:168)]) [default@unknown] describe ring fooks; Keyspace with name 'fooks' wasn't found, , please, authorize to one of the keyspaces first. [default@unknown] describe ring; Syntax error at position 13: mismatched input ';' expecting set null {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3220) add describe_ring to cli
[ https://issues.apache.org/jira/browse/CASSANDRA-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jackson Chung updated CASSANDRA-3220: - Attachment: patch3220.diff --- src/java/org/apache/cassandra/cli/CliClient.java(revision 1171325) --- src/java/org/apache/cassandra/cli/Cli.g (revision 1171325) --- src/resources/org/apache/cassandra/cli/CliHelp.yaml (revision 1171325) add describe_ring to cli Key: CASSANDRA-3220 URL: https://issues.apache.org/jira/browse/CASSANDRA-3220 Project: Cassandra Issue Type: Improvement Reporter: Jackson Chung Priority: Minor Attachments: patch3220.diff Lately I have found the describe_ring feature was needed to debug/analyze issue, but the cli does not have this available. So just in case it is useful, please see the attached patch. here is the sample output: {noformat} [default@unknown] help; ... ... decrDecrements a counter column. describe ring Describe the token range information. describe clusterDescribe the cluster configuration. ... ... [default@unknown] help describe ring; describe ring keyspace; Describes the token range settings for the named keyspace. Required Parameters: - keyspace: Name of the keyspace to describe the token range. Examples: describe ring keyspace; - Describes the token range settings for the named keyspace. [default@unknown] describe ring Keyspace3; TokenRange: TokenRange(start_token:9739248273232290250409572410247679660, end_token:9739248273232290250409572410247679660, endpoints:[192.168.0.125], rpc_endpoints:[192.168.0.125], endpoint_details:[EndpointDetails(host:192.168.0.125, port:9160, datacenter:168)]) [default@unknown] describe ring fooks; Keyspace with name 'fooks' wasn't found, , please, authorize to one of the keyspaces first. [default@unknown] describe ring; Syntax error at position 13: mismatched input ';' expecting set null {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3220) add describe_ring to cli
[ https://issues.apache.org/jira/browse/CASSANDRA-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106664#comment-13106664 ]

Jackson Chung commented on CASSANDRA-3220:
------------------------------------------

More sample output. I have not actually tested this with keyspace authentication or on a real cluster, but both should work.

{noformat}
[default@unknown] describe ring system;
null
InvalidRequestException(why:There is no ring for the keyspace: system)
    at org.apache.cassandra.thrift.Cassandra$describe_ring_result.read(Cassandra.java:23267)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_ring(Cassandra.java:1262)
    at org.apache.cassandra.thrift.Cassandra$Client.describe_ring(Cassandra.java:1237)
    at org.apache.cassandra.cli.CliClient.executeDescribeRing(CliClient.java:1437)
    at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:288)
    at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:222)
    at org.apache.cassandra.cli.CliMain.main(CliMain.java:350)
{noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3220) add describe_ring to cli
[ https://issues.apache.org/jira/browse/CASSANDRA-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106758#comment-13106758 ]

Jackson Chung commented on CASSANDRA-3220:
------------------------------------------

Hm, I was running into the issue reported in #3044 and #2388:

{noformat}
2011-09-14 22:58:44,004 ERROR CliDriver (SessionState.java:printError(343)) - Failed with exception java.io.IOException:java.io.IOException: Could not get input splits
java.io.IOException: java.io.IOException: Could not get input splits
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:341)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:133)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1114)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.io.IOException: Could not get input splits
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:157)
    at org.apache.hadoop.hive.cassandra.input.HiveCassandraStandardColumnInputFormat.getSplits(HiveCassandraStandardColumnInputFormat.java:326)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:281)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:320)
    ... 10 more
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:153)
    ... 13 more
Caused by: java.io.IOException: failed connecting to all endpoints
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:234)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:70)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:190)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:175)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}

The "failed connecting to all endpoints" message is actually supposed to print the endpoints:

throw new IOException("failed connecting to all endpoints " + StringUtils.join(range.endpoints, ","));

Since the stacktrace doesn't print anything after "endpoints", that's when I learned that my range.endpoints are null, which prompted me to find out what my range is.

The reason I chose to do it in the cli is that I was looking at how the code gets this information internally, and it mostly uses the thrift API. I could look into whether it can be done in nodetool, but could you explain why that would be better?
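The reasoning in the comment above rests on nothing being printed after "endpoints". A minimal sketch of that effect for the empty-list case follows. It is only an illustration: it uses String.join from the JDK rather than the StringUtils.join call quoted above, the class and method names are hypothetical, and it cannot confirm whether the list in the original run was empty or null.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of Cassandra or the attached patch.
class EndpointMessageSketch {

    // Builds the message in the same shape as the quoted throw statement,
    // using the JDK's String.join as a stand-in for StringUtils.join.
    static String failureMessage(List<String> endpoints) {
        return "failed connecting to all endpoints " + String.join(",", endpoints);
    }

    public static void main(String[] args) {
        // With no endpoints, nothing follows the fixed text.
        System.out.println("[" + failureMessage(Collections.<String>emptyList()) + "]");
        // With endpoints, they are appended comma-separated.
        System.out.println("[" + failureMessage(Arrays.asList("192.168.0.125")) + "]");
    }
}
```

Seeing a message that stops at the fixed text is therefore a hint that the token range carried no usable endpoints, which is what describe ring helps to check.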
[jira] [Commented] (CASSANDRA-3114) After Choosing EC2Snitch you can't migrate off w/o a full cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105768#comment-13105768 ]

Jackson Chung commented on CASSANDRA-3114:
------------------------------------------

I don't see how making your dc/rack names your external IP address is going to solve anything.

Well, the NPE was on

{code}
return Gossiper.instance.getEndpointStateForEndpoint(endpoint).getApplicationState(ApplicationState.DC).value;
{code}

The given endpoint is not the local address; it's the address of another node. For those other nodes, if they are not using the Ec2Snitch (which would have populated ApplicationState.DC and ApplicationState.RACK with values), getApplicationState(ApplicationState.DC) (and getApplicationState(ApplicationState.RACK), for that matter) is going to return null. Hence you get an NPE from that line on .value.

Defaulting AbstractEndpointSnitch's gossiperStarting to populate ApplicationState.DC and ApplicationState.RACK will then help any snitch that relies on the gossip info for getDC and getRack.
After Choosing EC2Snitch you can't migrate off w/o a full cluster restart
-------------------------------------------------------------------------
Key: CASSANDRA-3114
URL: https://issues.apache.org/jira/browse/CASSANDRA-3114
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.7.8, 0.8.4
Reporter: Benjamin Coverston

Once you choose the Ec2Snitch the gossip messages will trigger this exception if you try to move (for example) to the property file snitch:

ERROR [pool-2-thread-11] 2011-08-30 16:38:06,935 Cassandra.java (line 3041) Internal error processing get_slice
java.lang.NullPointerException
    at org.apache.cassandra.locator.Ec2Snitch.getDatacenter(Ec2Snitch.java:84)
    at org.apache.cassandra.locator.DynamicEndpointSnitch.getDatacenter(DynamicEndpointSnitch.java:122)
    at org.apache.cassandra.service.DatacenterReadCallback.assureSufficientLiveNodes(DatacenterReadCallback.java:77)
    at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:516)
    at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:480)
    at org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:109)
    at org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:263)
    at org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:345)
    at org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:306)
    at org.apache.cassandra.thrift.Cassandra$Processor$get_slice.process(Cassandra.java:3033)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
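The NullPointerException discussed in this message comes from dereferencing a gossip value that another node never published. A minimal, self-contained sketch of that failure mode and of a null-tolerant lookup is shown below. Everything in it is hypothetical: the nested map merely stands in for gossip state, and the class, method and fallback value are illustrative names, not Cassandra's snitch API or the fix that was eventually committed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for per-endpoint gossip state; not Cassandra code.
class DatacenterLookupSketch {

    // gossip maps an endpoint address to its published application states
    // (for example "DC" and "RACK"). A node whose snitch never published
    // "DC" simply has no entry for that key.
    static String datacenterOf(Map<String, Map<String, String>> gossip,
                               String endpoint,
                               String fallback) {
        Map<String, String> states = gossip.get(endpoint);
        if (states == null) {
            return fallback; // endpoint not known to gossip at all
        }
        String dc = states.get("DC");
        // Chaining states.get("DC").someField here would throw the NPE
        // when the remote node did not publish a datacenter.
        return dc != null ? dc : fallback;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> gossip = new HashMap<String, Map<String, String>>();
        Map<String, String> publishing = new HashMap<String, String>();
        publishing.put("DC", "us-east");
        gossip.put("10.0.0.1", publishing);
        gossip.put("10.0.0.2", new HashMap<String, String>()); // publishes nothing

        System.out.println(datacenterOf(gossip, "10.0.0.1", "UNKNOWN"));
        System.out.println(datacenterOf(gossip, "10.0.0.2", "UNKNOWN"));
    }
}
```

A fallback only hides the missing value. The suggestion in the comment, having every snitch publish DC and RACK when gossip starts, addresses the cause instead.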
[jira] [Issue Comment Edited] (CASSANDRA-3114) After Choosing EC2Snitch you can't migrate off w/o a full cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105768#comment-13105768 ]

Jackson Chung edited comment on CASSANDRA-3114 at 9/15/11 11:32 PM:
--------------------------------------------------------------------

I don't see how making your dc/rack names your external IP address is going to solve anything.

Well, the NPE was on

{code}
return Gossiper.instance.getEndpointStateForEndpoint(endpoint).getApplicationState(ApplicationState.DC).value;
{code}

The given endpoint is not the local address; it's the address of another node. For those other nodes, if they are not using the Ec2Snitch (which would have populated ApplicationState.DC and ApplicationState.RACK with values), getApplicationState(ApplicationState.DC) (and getApplicationState(ApplicationState.RACK), for that matter) is going to return null. Hence you get an NPE from that line on .value.

Defaulting AbstractEndpointSnitch's gossiperStarting to populate ApplicationState.DC and ApplicationState.RACK will then help any snitch that relies on the gossip info for getDC and getRack.

was (Author: cywjackson):

I don't see how making your dc/rack names your external IP address is going to solve anything.

well the NPE was on

{code}
return Gossiper.instance.getEndpointStateForEndpoint(endpoint).getApplicationState(ApplicationState.DC).value;
{code}

the given endpoint is not the local address; its the address from other nodes. For those other nodes, if they are not using the Ec2Snitch, which would have populated the ApplicationState.DC and ApplicationState.RACK with the values, getApplicationState(ApplicationState.DC) (and ).getApplicationState(ApplicationState.RACK) for that matter) is going to be return null. Hence you got a NPE from that line on .value.

Defaulting the AbstractEndpointSnitch's gossiperStarting by populating the ApplicationState.DC,ApplicationState.RACK wll help then any snitch relying the gossip info to getDC and getRack.
[jira] [Commented] (CASSANDRA-3114) After Choosing EC2Snitch you can't migrate off w/o a full cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099219#comment-13099219 ]

Jackson Chung commented on CASSANDRA-3114:
------------------------------------------

What if we do this in the abstract class?

{code:title=AbstractEndpointSnitch.java}
public void gossiperStarting()
{
    // Ask the concrete snitch for this node's own datacenter and rack.
    String dc = getDatacenter(FBUtilities.getBroadcastAddress());
    String rack = getRack(FBUtilities.getBroadcastAddress());
    logger.info(this.getClass().getSimpleName() + " adding ApplicationState DC=" + dc + " Rack=" + rack);
    // Publish both values through gossip so that other nodes can read them.
    Gossiper.instance.addLocalApplicationState(ApplicationState.DC, StorageService.instance.valueFactory.datacenter(dc));
    Gossiper.instance.addLocalApplicationState(ApplicationState.RACK, StorageService.instance.valueFactory.rack(rack));
}
{code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3114) After Choosing EC2Snitch you can't migrate off w/o a full cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099226#comment-13099226 ]

Jackson Chung commented on CASSANDRA-3114:
------------------------------------------

"but you'd still have to name things with the ec2snitch conventions for things to not break" still holds true with the above.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-3057) secondary index on a column that has a value of size 64k will fail on flush
secondary index on a column that has a value of size 64k will fail on flush
---------------------------------------------------------------------------
Key: CASSANDRA-3057
URL: https://issues.apache.org/jira/browse/CASSANDRA-3057
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Priority: Minor

An exception is seen on flush when an indexed column contains a value of size 64k.

Granted, a 64k value possibly means something that shouldn't be indexed, as it would most likely have high cardinality, but I think there are still some valid use cases for it.

Test case: simply run the stress test with

-n 1 -u 0 -c 2 -y Standard -o INSERT -S 65536 -x KEYS

then call a flush.

Exception:

INFO [FlushWriter:8] 2011-08-18 21:49:33,214 Memtable.java (line 218) Writing Memtable-Standard1.Idx1@1652462853(16/20 serialized/live bytes, 1 ops)
Standard1@980087547(196659/245823 serialized/live bytes, 3 ops)
ERROR [FlushWriter:8] 2011-08-18 21:49:33,230 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[FlushWriter:8,5,RMI Runtime]
java.lang.AssertionError: 65536
    at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:330)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
    at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:245)
    at org.apache.cassandra.db.Memtable.access$400(Memtable.java:49)
    at org.apache.cassandra.db.Memtable$3.runMayThrow(Memtable.java:270)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
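The assertion in the trace above fires in a method named writeWithShortLength, which suggests a two-byte length prefix; presumably the indexed value ends up written that way as a key in the index. The sketch below illustrates why 65536 is exactly one byte too many for such a prefix. It is a hedged illustration only: the class is hypothetical, it throws an IllegalArgumentException instead of using an assert, and it is not a copy of Cassandra's ByteBufferUtil.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a two-byte length prefix; not Cassandra's code.
class ShortLengthSketch {

    // Largest length that fits in an unsigned 16-bit prefix.
    static final int MAX_UNSIGNED_SHORT = 0xFFFF; // 65535

    // Returns the value preceded by its length encoded in two bytes.
    static byte[] writeWithShortLength(byte[] value) {
        if (value.length > MAX_UNSIGNED_SHORT) {
            throw new IllegalArgumentException(
                "length " + value.length + " does not fit in a two-byte prefix");
        }
        ByteBuffer out = ByteBuffer.allocate(2 + value.length);
        out.putShort((short) value.length); // keeps only the low 16 bits
        out.put(value);
        return out.array();
    }

    public static void main(String[] args) {
        System.out.println(writeWithShortLength(new byte[65535]).length);
        try {
            writeWithShortLength(new byte[65536]); // same size as -S 65536
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under this reading, a value of 65535 bytes would still flush, while the 65536 bytes requested by -S 65536 in the test case cannot be encoded.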
[jira] [Commented] (CASSANDRA-2967) Only bind JMX to the same IP address that is being used in Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083430#comment-13083430 ]

Jackson Chung commented on CASSANDRA-2967:
------------------------------------------

I wouldn't say it is lhf; this is quite important in terms of network security.

Not adding much value here, just a reference similar to the blog post above (this one is from Sun's official documentation):

http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html

If tl;dr, see the "Monitoring Applications through a Firewall" section.

Only bind JMX to the same IP address that is being used in Cassandra
--------------------------------------------------------------------
Key: CASSANDRA-2967
URL: https://issues.apache.org/jira/browse/CASSANDRA-2967
Project: Cassandra
Issue Type: Bug
Components: Tools
Affects Versions: 0.8.2
Reporter: Joaquin Casares
Priority: Minor
Labels: lhf

In this setup, the 5 nodes in each data center all run on one physical test machine, and even though the repair was run against the correct IP, the wrong JMX port was used. As a result, instead of repairing all 5 nodes I was repairing the same node 5 times.

It would be nice if Cassandra's JMX would bind only to the IP address on which its thrift/RPC services are listening, instead of binding to all IPs on the box.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2845) Cassandra uses 100% system CPU on Ubuntu Natty (11.04)
[ https://issues.apache.org/jira/browse/CASSANDRA-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083455#comment-13083455 ]

Jackson Chung commented on CASSANDRA-2845:
------------------------------------------

FWIW, I was able to avoid this hang by starting with plain java (Sun's) instead of jsvc. (JNA was enabled in both cases; when installing from the deb package and starting manually, I have to symlink it by hand.) Once I switch to jsvc, all hell breaks loose.

The kernel was the older one on EC2:

Linux domU-12-31-39-00-2C-42 2.6.38-8-virtual #42-Ubuntu SMP Mon Apr 11 04:06:34 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

I was also able to get the same hang on a 2.6.35 kernel on Rackspace a couple of days ago (the VM has already been killed). dmesg showed timeouts/OOM, crazy stuff :)

On my own local machine, with a separately installed kernel:

Distributor ID: Ubuntu
Description: Ubuntu 10.10
Release: 10.10
Codename: maverick
Linux faranth 2.6.39-02063903-generic #201107091121 SMP Sat Jul 9 11:25:36 UTC 2011 x86_64 GNU/Linux

I don't have the hang problem (using jsvc/jna/package).

Cassandra uses 100% system CPU on Ubuntu Natty (11.04)
------------------------------------------------------
Key: CASSANDRA-2845
URL: https://issues.apache.org/jira/browse/CASSANDRA-2845
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.0
Environment: Default install of Ubuntu 11.04
Reporter: Steve Corona
Assignee: paul cannon
Priority: Critical
Fix For: 0.8.2

Step 1. Boot up a brand new, default Ubuntu 11.04 Server install
Step 2. Install Cassandra from the Apache APT Repository (deb http://www.apache.org/dist/cassandra/debian 08x main)
Step 3. apt-get install cassandra; as soon as cassandra starts it will freeze the machine

What's happening is that as soon as cassandra starts up it immediately sucks up 100% of CPU and starves the machine. This effectively bricks the box until you boot into single user mode and disable the cassandra init.d script.

Under htop, the CPU usage shows up as system cpu, not user.

The machine I'm testing this on is a Quad-Core Sandy Bridge w/ 16GB of Memory, so it's not a system resource issue. I've also tested this on completely different hardware (Dual 64-Bit Xeons AMD X4) and it has the same effect. Ubuntu 10.10 does not exhibit the same issue. I have only tested 0.8 and 0.8.1.

root@cassandra01:/# java -version
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.2) (6b22-1.10.2-0ubuntu1~11.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)

root@cassandra:/# uname -a
Linux cassandra01 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

/proc/cpu
Intel(R) Xeon(R) CPU E31270 @ 3.40GHz

/proc/meminfo
MemTotal: 16459776 kB
MemFree: 14190708 kB

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-2983) CliClient print memtable threshold in incorrect order
CliClient print memtable threshold in incorrect order
-----------------------------------------------------
Key: CASSANDRA-2983
URL: https://issues.apache.org/jira/browse/CASSANDRA-2983
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.8.2
Reporter: Jackson Chung
Priority: Minor

As a continuation of #2839: it looks like the change was incorrectly merged into 0.8 as well, hence affecting 0.8.2. For trunk, this has also changed (the time value is taken out).

So format-wise, I guess we should stick with the fixed format in 0.7.7 per #2839, which is:

{code}
sessionState.out.printf(" Memtable thresholds: %s/%s/%s (millions of ops/minutes/MB)%n",
                        cf_def.memtable_operations_in_millions,
                        cf_def.memtable_flush_after_mins,
                        cf_def.memtable_throughput_in_mb);
{code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
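To make the intended ordering concrete, here is a small sketch of the same format string applied to sample values. The numbers, the class name and the method name are hypothetical and chosen only for illustration; the trailing %n and the leading space of the original format are left out so that the result is a single predictable string.

```java
// Hypothetical demo of the ops/minutes/MB ordering; values are made up.
class MemtableThresholdFormatSketch {

    // Arguments are passed in the same order as the label in parentheses:
    // millions of ops, then minutes, then MB.
    static String describe(double opsInMillions, int flushAfterMins, int throughputInMb) {
        return String.format("Memtable thresholds: %s/%s/%s (millions of ops/minutes/MB)",
                             opsInMillions, flushAfterMins, throughputInMb);
    }

    public static void main(String[] args) {
        System.out.println(describe(0.5, 1440, 64));
    }
}
```

If the arguments were passed in any other order, the numbers would no longer line up with the units named in the label, which is the problem this issue reports.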
[jira] [Created] (CASSANDRA-2971) Append (not add new) InetAddress info logging when starting MessagingService
Append (not add new) InetAddress info logging when starting MessagingService
----------------------------------------------------------------------------
Key: CASSANDRA-2971
URL: https://issues.apache.org/jira/browse/CASSANDRA-2971
Project: Cassandra
Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor
Attachments: 2971.patch

Currently we have

{code: title=MessagingService.getServerSocket(InetAddress localEp) }
logger_.info("Starting Messaging Service on port {}", DatabaseDescriptor.getStoragePort());
{code}

We should probably just print the whole bound address. The address is an InetSocketAddress:

{code}
InetSocketAddress address = new InetSocketAddress(localEp, DatabaseDescriptor.getStoragePort());
try
{
    ss.bind(address);
}
{code}

{code}
logger_.info("Starting Messaging Service on {}", address);
{code}

Sample output with the new log:

{noformat}
INFO [main] 2011-07-29 18:54:54,018 MessagingService.java (line 226) Starting Messaging Service on faranth/192.168.1.141:7000
{noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
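The proposed log line relies on the string form of InetSocketAddress, which includes host name, IP address and port. A minimal sketch of that behaviour with plain JDK classes follows. The class and method names are hypothetical, the host name and address are taken from the sample output above, and string concatenation stands in for the logger call.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical demo; uses only java.net classes, no Cassandra code.
class BoundAddressLogSketch {

    // Builds the message text that the proposed log statement would produce.
    static String describe(String host, byte[] ip, int port) {
        try {
            // getByAddress performs no name lookup; it only pairs the given
            // host name with the raw address bytes.
            InetAddress addr = InetAddress.getByAddress(host, ip);
            return "Starting Messaging Service on " + new InetSocketAddress(addr, port);
        } catch (UnknownHostException e) {
            // Only thrown when the byte array has an illegal length.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        byte[] ip = { (byte) 192, (byte) 168, 1, (byte) 141 };
        System.out.println(describe("faranth", ip, 7000));
    }
}
```

Compared with logging only the port, this makes it visible which interface the service actually bound to.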
[jira] [Updated] (CASSANDRA-2971) Append (not add new) InetAddress info logging when starting MessagingService
[ https://issues.apache.org/jira/browse/CASSANDRA-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackson Chung updated CASSANDRA-2971:
-------------------------------------
Attachment: 2971.patch

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2895) add java classpath to cassandra startup logging
[ https://issues.apache.org/jira/browse/CASSANDRA-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069676#comment-13069676 ]

Jackson Chung commented on CASSANDRA-2895:
------------------------------------------

Sure.

First off, it is absolutely possible to obtain the classpath with other techniques, such as jconsole, jinfo, ps, /proc/pid/cmdline, etc. All of these work, provided they are used properly. Some typical challenges with these other techniques: PATH differs between the user who starts cassandra and the user who sshes into the machine to check, so attaching to the jvm fails because of a jvm mismatch. Attaching remotely fails due to permissions. The ps output is truncated. The /proc/pid/cmdline output is ugly (no separator).

We need the classpath info to identify which jars contain the classes. Granted, knowing the classpath alone is not enough, as one would still need to verify that the actual path/jar exists (with proper permissions). Down the road, when hotfixes accumulate (and, for the sake of argument, in a scenario where a hotfix contains only the fixed classes rather than the whole build), we would need to check that a hotfix jar is properly set in the classpath to validate that the java process is running with the hotfix.

A real (small) example:

INFO [main] 2011-07-22 11:34:10,113 CLibrary.java (line 61) JNA not found. Native methods will be disabled.

From an end-user perspective: not found where? A java-experienced user would most likely think of the classpath right away, but for others it may not be obvious. A support analyst then has to do their own analysis to determine whether jna.jar is on the classpath and in the stated location (e.g. jna.jar is in /path/to/ but the classpath says /path/to/others/). This gets harder still if it has to be done remotely or the analyst has no direct access to the machine.

Hence the classpath is one of the more important pieces of information for someone who understands not just Cassandra but what java in general is doing.

As opposed to using DEBUG: currently the logging does not have a header set, i.e. the line is logged once and does not appear again. Logging at DEBUG something that is only logged once at startup means one would have to restart the jvm to see it. That seems like overkill just to get one log line; one may as well learn to use jinfo/jconsole/ps/cmdline to get that info.

I agree we wouldn't want every bit of detail (say, everything available via jinfo) to be logged; that would add quite a bit of noise to the log. On the other hand, the classpath by itself provides a quick and strong check for identifying any misconfiguration of the classpath settings.

add java classpath to cassandra startup logging
-----------------------------------------------
Key: CASSANDRA-2895
URL: https://issues.apache.org/jira/browse/CASSANDRA-2895
Project: Cassandra
Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor
Attachments: 2895.diff

This is helpful to determine/verify whether Cassandra is started with the expected classpath.

It's a simple, useful one-line addition; will submit a patch later.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
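The jna.jar example in the comment above can be turned into a small check once the classpath string is available, for instance from a logged line. The sketch below is a hedged illustration: the class and method names are hypothetical, and it only inspects the classpath string; as the comment notes, one would still need to verify that the file actually exists and is readable.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting a classpath string; not Cassandra code.
class ClasspathCheckSketch {

    // Returns true if any classpath entry's file name equals jarName.
    static boolean containsJar(String classpath, String separator, String jarName) {
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (new File(entry).getName().equals(jarName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The running JVM's own classpath, split with the platform separator.
        String classpath = System.getProperty("java.class.path");
        System.out.println("Classpath: " + classpath);
        System.out.println("jna.jar listed: "
            + containsJar(classpath, File.pathSeparator, "jna.jar"));
    }
}
```

With the classpath in the startup log, the same check can be done by eye or with grep, without attaching to the JVM.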
[jira] [Commented] (CASSANDRA-2895) add java classpath to cassandra startup logging
[ https://issues.apache.org/jira/browse/CASSANDRA-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13066331#comment-13066331 ]

Jackson Chung commented on CASSANDRA-2895:
--

svn diff ./src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
Index: src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
===
--- src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java (revision 1147350)
+++ src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java (working copy)
@@ -101,6 +101,7 @@
 {
     logger.info("JVM vendor/version: {}/{}", System.getProperty("java.vm.name"), System.getProperty("java.version") );
     logger.info("Heap size: {}/{}", Runtime.getRuntime().totalMemory(), Runtime.getRuntime().maxMemory());
+    logger.info("Classpath: {}", System.getProperty("java.class.path"));
     CLibrary.tryMlockall();
 
     listenPort = DatabaseDescriptor.getRpcPort();

{noformat}
INFO [main] 2011-07-15 18:02:04,110 AbstractCassandraDaemon.java (line 104) Classpath: 
/home/jackson/work/cassandra-trunk/conf:/home/jackson/work/cassandra-trunk/build/classes/main:/home/jackson/work/cassandra-trunk/build/classes/thrift:/home/jackson/work/cassandra-trunk/lib/antlr-3.2.jar:/home/jackson/work/cassandra-trunk/lib/apache-cassandra-0.8.0-SNAPSHOT.jar:/home/jackson/work/cassandra-trunk/lib/apache-cassandra-cql-1.0.3.jar:/home/jackson/work/cassandra-trunk/lib/apache-cassandra-thrift-0.8.0-SNAPSHOT.jar:/home/jackson/work/cassandra-trunk/lib/avro-1.4.0-fixes.jar:/home/jackson/work/cassandra-trunk/lib/avro-1.4.0-sources-fixes.jar:/home/jackson/work/cassandra-trunk/lib/commons-cli-1.1.jar:/home/jackson/work/cassandra-trunk/lib/commons-codec-1.2.jar:/home/jackson/work/cassandra-trunk/lib/commons-lang-2.4.jar:/home/jackson/work/cassandra-trunk/lib/concurrentlinkedhashmap-lru-1.2.jar:/home/jackson/work/cassandra-trunk/lib/guava-r08.jar:/home/jackson/work/cassandra-trunk/lib/high-scale-lib-1.1.2.jar:/home/jackson/work/cassandra-trunk/lib/jackson-core-asl-1.4.0.jar:/home/jackson/work/cassandra-trunk/lib/jackson-mapper-asl-1.4.0.jar:/home/jackson/work/cassandra-trunk/lib/jamm-0.2.2.jar:/home/jackson/work/cassandra-trunk/lib/jline-0.9.94.jar:/home/jackson/work/cassandra-trunk/lib/jna.jar:/home/jackson/work/cassandra-trunk/lib/json-simple-1.1.jar:/home/jackson/work/cassandra-trunk/lib/libthrift-0.6.jar:/home/jackson/work/cassandra-trunk/lib/log4j-1.2.16.jar:/home/jackson/work/cassandra-trunk/lib/maven-ant-tasks-2.1.3.jar:/home/jackson/work/cassandra-trunk/lib/servlet-api-2.5-20081211.jar:/home/jackson/work/cassandra-trunk/lib/slf4j-api-1.6.1.jar:/home/jackson/work/cassandra-trunk/lib/slf4j-log4j12-1.6.1.jar:/home/jackson/work/cassandra-trunk/lib/snakeyaml-1.6.jar:/home/jackson/work/cassandra-trunk/lib/jamm-0.2.2.jar {noformat} add java classpath to cassandra startup logging --- Key: CASSANDRA-2895 URL: https://issues.apache.org/jira/browse/CASSANDRA-2895 Project: Cassandra Issue Type: Improvement Reporter: Jackson Chung Priority: Minor this is helpful 
to determine/verify whether Cassandra is started with the expected classpath. It's a simple one-line addition that is useful... will submit a patch later.
[jira] [Updated] (CASSANDRA-2895) add java classpath to cassandra startup logging
[ https://issues.apache.org/jira/browse/CASSANDRA-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackson Chung updated CASSANDRA-2895:
-
    Attachment: 2895.diff

trunk
[jira] [Created] (CASSANDRA-2895) add java classpath to cassandra startup logging
add java classpath to cassandra startup logging
---
Key: CASSANDRA-2895
URL: https://issues.apache.org/jira/browse/CASSANDRA-2895
Project: Cassandra
Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor

This is helpful to determine/verify whether Cassandra is started with the expected classpath. It's a simple one-line addition that is useful... will submit a patch later.
[jira] [Created] (CASSANDRA-2824) assert err on SystemTable.getCurrentLocalNodeId during a cleanup
assert err on SystemTable.getCurrentLocalNodeId during a cleanup

Key: CASSANDRA-2824
URL: https://issues.apache.org/jira/browse/CASSANDRA-2824
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

When running nodetool cleanup, the following happened:

$ ./bin/nodetool cleanup --host localhost
Exception in thread "main" java.lang.AssertionError
	at org.apache.cassandra.db.SystemTable.getCurrentLocalNodeId(SystemTable.java:383)
	at org.apache.cassandra.utils.NodeId$LocalNodeIdHistory.<init>(NodeId.java:179)
	at org.apache.cassandra.utils.NodeId.<clinit>(NodeId.java:38)
	at org.apache.cassandra.utils.NodeId$OneShotRenewer.<init>(NodeId.java:159)
	at org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1317)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
	at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
	at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
	at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
	at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
	at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
	at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
	at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
	at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
	at sun.rmi.transport.Transport$1.run(Transport.java:159)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
[jira] [Created] (CASSANDRA-2808) add java vendor/versoin to cassandra startup logging
add java vendor/versoin to cassandra startup logging

Key: CASSANDRA-2808
URL: https://issues.apache.org/jira/browse/CASSANDRA-2808
Project: Cassandra
Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor

Currently, determining exactly which java is being used by the CassandraDaemon JVM can be difficult. Some may have used an rpm/deb java package, others may have used a tarball and set JAVA_HOME, PATH, etc. It can be done, but may take some iteration to get the true answer if one's setup is complicated (e.g. the user is root/cassandra and the environment settings differ between the cassandra startup user and the login user).

It would be very helpful to have this information simply logged in the log file, right at the beginning. This helps identify the java type/version quickly without much operational overhead, and is easily done in a one-liner:

logger.info("Java vendor/version: {}/{}", System.getProperty("java.vm.name"), System.getProperty("java.version") );

With OpenJDK, you will get something similar to:

INFO [main] 2011-06-22 07:08:16,610 AbstractCassandraDaemon.java (line 95) Java vendor/version: OpenJDK 64-Bit Server VM/1.6.0_20

With Java(TM), you will get something like:

INFO [main] 2011-06-22 00:15:34,936 AbstractCassandraDaemon.java (line 96) Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24

This little addition will go a long way.
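The one-liner in this issue depends on the daemon's logger. As a rough standalone sketch of the same idea, the snippet below builds the message with plain `String.format`. The class name `JvmIdentity` is hypothetical, and the extra `java.home` property is an addition for illustration that the issue does not propose; it can help tell a JAVA_HOME install apart from a java found on PATH.

```java
// Hypothetical standalone sketch (not the attached patch): formats the same
// vendor/version information from standard JVM system properties.
final class JvmIdentity {

    // Builds a message in the style of the proposed log line.
    static String describe() {
        return String.format("Java vendor/version: %s/%s (java.home=%s)",
                System.getProperty("java.vm.name"),
                System.getProperty("java.version"),
                System.getProperty("java.home")); // assumption: extra field, not in the issue
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```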
[jira] [Updated] (CASSANDRA-2808) add java vendor/versoin to cassandra startup logging
[ https://issues.apache.org/jira/browse/CASSANDRA-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackson Chung updated CASSANDRA-2808:
-
    Attachment: 2808.patch

patch based on trunk
[jira] [Created] (CASSANDRA-2813) more info on logging when SSTable cannot create the builder due to version mismatch
more info on logging when SSTable cannot create the builder due to version mismatch
---
Key: CASSANDRA-2813
URL: https://issues.apache.org/jira/browse/CASSANDRA-2813
Project: Cassandra
Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor
Attachments: 2813.patch

When running into the following:

2011-06-21 22:44:43,308 INFO [org.apache.cassandra.streaming.StreamOutSession] - Streaming to /10.128.64.163
2011-06-21 22:44:51,993 ERROR [org.apache.cassandra.service.AbstractCassandraDaemon] - Fatal exception in thread Thread[Thread-17651,5,main]
java.lang.RuntimeException: Cannot recover SSTable with version a (current version f).
	at org.apache.cassandra.io.sstable.SSTableWriter.createBuilder(SSTableWriter.java:237)
	at org.apache.cassandra.db.CompactionManager.submitSSTableBuild(CompactionManager.java:938)
	at org.apache.cassandra.streaming.StreamInSession.finished(StreamInSession.java:107)
	at org.apache.cassandra.streaming.IncomingStreamReader.readFile(IncomingStreamReader.java:112)
	at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:61)
	at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:91)

There is no indication of which SSTable is at fault. To recover from this, one would need to run nodetool scrub. This may take some time, however, depending on the SSTables' sizes, and it is possible that only one keyspace or CF needs to be rebuilt by scrub. It'd be nice to print more details of the SSTable here in case the end user prefers to scrub only the keyspace/CF in question.
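To make the request concrete, here is a minimal sketch of an error message that names the offending SSTable. It is not the content of 2813.patch: the class `SSTableVersionCheck`, the method `checkVersion`, and the keyspace, column family and file name used in `main` are all hypothetical values chosen for illustration.

```java
// Hypothetical sketch (not Cassandra code): a version check whose error
// message identifies the keyspace, column family and file at fault.
final class SSTableVersionCheck {

    // Assumed stand-in for the current on-disk format version ("f" in the log above).
    static final String CURRENT_VERSION = "f";

    static void checkVersion(String keyspace, String columnFamily, String filename, String version) {
        if (!CURRENT_VERSION.equals(version)) {
            throw new RuntimeException(String.format(
                    "Cannot recover SSTable %s (keyspace %s, column family %s) with version %s (current version %s).",
                    filename, keyspace, columnFamily, version, CURRENT_VERSION));
        }
    }

    public static void main(String[] args) {
        try {
            // Made-up names, purely for demonstration.
            checkVersion("Keyspace1", "Standard1", "Standard1-a-12-Data.db", "a");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a message of this shape, an operator could limit nodetool scrub to the named keyspace and column family instead of scrubbing everything.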
[jira] [Updated] (CASSANDRA-2813) more info on logging when SSTable cannot create the builder due to version mismatch
[ https://issues.apache.org/jira/browse/CASSANDRA-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackson Chung updated CASSANDRA-2813:
-
    Attachment: 2813.patch

based on trunk
[jira] [Created] (CASSANDRA-2785) should export JAVA variable in the bin/cassandra and use that in the cassandra-env.sh when check for the java version
should export JAVA variable in the bin/cassandra and use that in the cassandra-env.sh when check for the java version
-
Key: CASSANDRA-2785
URL: https://issues.apache.org/jira/browse/CASSANDRA-2785
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

I forget which JIRA issue added the java -version check to cassandra-env.sh (for adding jamm as the javaagent), but we should probably export the JAVA variable set in bin/cassandra and use $JAVA instead of java in cassandra-env.sh.

In a situation where JAVA_HOME is properly set to Sun's java but PATH still has OpenJDK's java in front, the check will fail to add jamm.jar, even though the cassandra JVM is properly started via Sun's java.
[jira] [Created] (CASSANDRA-2762) Token cannot contain comma (possibly non-alpha/non-numeric too?) in OrderPreservingPartitioner
Token cannot contain comma (possibly non-alpha/non-numeric too?) in OrderPreservingPartitioner
--
Key: CASSANDRA-2762
URL: https://issues.apache.org/jira/browse/CASSANDRA-2762
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

It appears that when the token contains a comma with the OrderPreservingPartitioner, C* fails with an assertion error:

ERROR [GossipStage:1] 2011-06-09 16:01:05,063 AbstractCassandraDaemon.java (line 112) Fatal exception in thread Thread[GossipStage:1,5,main]
java.lang.AssertionError
	at org.apache.cassandra.service.StorageService.handleStateBootstrap(StorageService.java:685)
	at org.apache.cassandra.service.StorageService.onChange(StorageService.java:648)
	at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:772)
	at org.apache.cassandra.gms.Gossiper.applyApplicationStateLocally(Gossiper.java:737)
	at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:679)
	at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:60)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
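The report does not state the root cause. One plausible explanation, offered here strictly as an assumption and not a confirmed diagnosis, is that the gossiped status value is a comma-delimited string and the receiver expects a fixed number of pieces, so a token that itself contains a comma produces an extra piece. The sketch below illustrates that failure mode; `StatusSplit`, the `BOOT` keyword, the `,` delimiter and the two-piece expectation are all assumptions made for illustration.

```java
// Hypothetical illustration (not Cassandra code) of how a delimiter that can
// also occur inside the payload breaks a fixed-arity parse.
final class StatusSplit {

    // Assumed delimiter for the gossiped status value.
    static final String DELIMITER = ",";

    // Splits a status value into its pieces, keeping empty trailing pieces.
    static String[] pieces(String statusValue) {
        return statusValue.split(DELIMITER, -1);
    }

    // Extracts the token, assuming the value has the form "<STATUS>,<token>".
    static String bootstrapToken(String statusValue) {
        String[] p = pieces(statusValue);
        if (p.length != 2) {
            throw new IllegalStateException("expected 2 pieces, got " + p.length);
        }
        return p[1];
    }

    public static void main(String[] args) {
        System.out.println(bootstrapToken("BOOT,abc")); // a token without a comma parses fine
        try {
            bootstrapToken("BOOT,ab,c"); // a token containing a comma yields three pieces
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this is indeed the mechanism, any token character that doubles as the delimiter would trigger the same failure, which would fit the question in the issue title about other non-alphanumeric characters.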
[jira] [Created] (CASSANDRA-2746) CliClient does not log root cause exception when catch it from executeCLIStatement
CliClient does not log root cause exception when catch it from executeCLIStatement
--
Key: CASSANDRA-2746
URL: https://issues.apache.org/jira/browse/CASSANDRA-2746
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

When executing a statement from the cassandra-cli (with --debug), if an exception is thrown from one of the cases inside the executeCLIStatement method, the root cause is swallowed. For specific cases such as InvalidRequestException or SchemaDisagreementException, the message itself may be enough, but for the general Exception case it can be difficult to debug the issue without the root cause.

For example, we have seen exceptions like:

{noformat}
null
java.lang.RuntimeException
	at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:209)
	at org.apache.cassandra.cli.CliMain.processStatement(CliMain.java:223)
	at org.apache.cassandra.cli.CliMain.main(CliMain.java:351)
{noformat}

The null there most likely indicates an NPE (though it could still be any Exception with a null message). By adding an initCause to the caught exception, we can see the root cause, e.g.:

{noformat}
null
java.lang.RuntimeException
	at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:212)
	at org.apache.cassandra.cli.CliMain.processStatement(CliMain.java:223)
	at org.apache.cassandra.cli.CliMain.main(CliMain.java:351)
Caused by: java.lang.NullPointerException
	at org.apache.cassandra.cli.CliClient.describeKeySpace(CliClient.java:1336)
	at org.apache.cassandra.cli.CliClient.executeShowKeySpaces(CliClient.java:1166)
	at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:170)
	... 2 more
{noformat}

Submitting a patch that adds the initCause to all the exceptions caught here. The most important one is the general Exception case.
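The difference between the two traces above can be reproduced in isolation. The following is a minimal sketch, not the content of patch2746.txt; the class and method names (`RootCauseDemo`, `wrapWithoutCause`, `wrapWithCause`, `failingStatement`) are hypothetical. It uses the standard `Throwable.initCause` call named in the issue.

```java
// Hypothetical sketch (not CliClient code): shows how rethrowing with only the
// message loses the root cause, and how initCause preserves it.
final class RootCauseDemo {

    // Stands in for a CLI statement handler that fails with an NPE.
    static void failingStatement() {
        throw new NullPointerException();
    }

    // Swallows the cause: only the (possibly null) message survives.
    static RuntimeException wrapWithoutCause(Exception e) {
        return new RuntimeException(e.getMessage());
    }

    // Keeps the cause, so printStackTrace() emits a "Caused by:" section.
    static RuntimeException wrapWithCause(Exception e) {
        RuntimeException wrapped = new RuntimeException(e.getMessage());
        wrapped.initCause(e);
        return wrapped;
    }

    public static void main(String[] args) {
        try {
            failingStatement();
        } catch (Exception e) {
            wrapWithCause(e).printStackTrace();
        }
    }
}
```

Passing the cause to the two-argument constructor, `new RuntimeException(message, cause)`, has the same effect as calling `initCause` afterwards; which form the actual patch uses is not shown in this message.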
[jira] [Updated] (CASSANDRA-2746) CliClient does not log root cause exception when catch it from executeCLIStatement
[ https://issues.apache.org/jira/browse/CASSANDRA-2746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackson Chung updated CASSANDRA-2746:
-
    Attachment: patch2746.txt

add initCause to exceptions caught from CliClient.executeCLIStatement
[jira] [Commented] (CASSANDRA-2615) in cassandra-cli, the help command output on validation types should be updated
[ https://issues.apache.org/jira/browse/CASSANDRA-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032713#comment-13032713 ] Jackson Chung commented on CASSANDRA-2615: -- +1 [default@testks] assume Super4 comparator as ascii ... ; Assumption for column family 'Super4' added successfully. [default@testks] update column family Super4; INFO 15:20:59,545 Applying migration 1d332bd0-7ce6-11e0--fd7033aa10e7 Update column family to org.apache.cassandra.config.CFMetaData@6e72d873[cfId=1000,ksName=testks,cfName=Super4,cfType=Super,comparator=org.apache.cassandra.db.marshal.AsciiType,subcolumncomparator=org.apache.cassandra.db.marshal.BytesType,comment=,rowCacheSize=1.0,keyCacheSize=20.0,readRepairChance=1.0,replicateOnWrite=false,gcGraceSeconds=864000,defaultValidator=org.apache.cassandra.db.marshal.UTF8Type,keyValidator=org.apache.cassandra.db.marshal.BytesType,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=14400,memtableFlushAfterMins=1440,memtableThroughputInMb=242,memtableOperationsInMillions=1.134375,mergeShardsChance=0.1,keyAlias=java.nio.HeapByteBuffer[pos=475 lim=478 cap=480],column_metadata={}] INFO 15:20:59,546 Enqueuing flush of Memtable-Migrations@903913131(7137/8921 serialized/live bytes, 1 ops) INFO 15:20:59,547 Writing Memtable-Migrations@903913131(7137/8921 serialized/live bytes, 1 ops) INFO 15:20:59,547 Enqueuing flush of Memtable-Schema@1257515479(2960/3700 serialized/live bytes, 3 ops) INFO 15:20:59,737 Completed flushing /var/lib/cassandra/data/system/Migrations-g-3-Data.db (7201 bytes) INFO 15:20:59,739 Writing Memtable-Schema@1257515479(2960/3700 serialized/live bytes, 3 ops) INFO 15:20:59,912 Completed flushing /var/lib/cassandra/data/system/Schema-g-3-Data.db (3110 bytes) 1d332bd0-7ce6-11e0--fd7033aa10e7 Waiting for schema agreement... ... 
schemas agree across the cluster

in cassandra-cli, the help command output on validation types should be updated

Key: CASSANDRA-2615
URL: https://issues.apache.org/jira/browse/CASSANDRA-2615
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Assignee: Pavel Yaskevich
Priority: Minor
Fix For: 0.8.0
Attachments: CASSANDRA-2615.patch

From cassandra-cli, type help assume and you will find:

Supported values are:
- AsciiType
- BytesType
- CounterColumnType (distributed counter column)
- IntegerType (a generic variable-length integer type)
- LexicalUUIDType
- LongType
- UTF8Type

OK, now:

[default@cfs] assume inode comparator as UTF8Type;
Type 'UTF8Type' was not found. Available: bytes, integer, long, lexicaluuid, timeuuid, utf8, ascii.

So it looks like the supported type list should be updated by dropping the Type suffix. On the other hand, you can't really use the short name either:

[default@cfs] update column family inode;
Unable to find abstract-type class 'org.apache.cassandra.db.marshal.utf8'

It looks like the update still needs the Type suffix (case insensitive?).
[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query
[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13030425#comment-13030425 ]

Jackson Chung commented on CASSANDRA-2401:
--

Here is one way I can reproduce the issue with data being null 100% of the time.

You need 2 nodes; one will autobootstrap against the other. This also assumes a completely clean start (blow away /var/lib/cassandra/ or wherever the data is stored). I am also using the Brisk beta to test.

To start:

node-A:
1) get brisk
2) start brisk with -t (jobtracker)
3) run a simple hive query:
3a) bin/brisk hive
3b) create table foo (bar INT);
3c) select count(*) from foo;
3d) exit;
4) everything should be fine so far; leave the brisk node up

node-B:
1) get brisk
2) modify resources/cassandra/conf/cassandra.yaml:
2a) enable autobootstrap
2b) point seeds to node-A
3) put a sleep or breakpoint in the o.a.c.service.StorageService.joinTokenRing method, right after Map<InetAddress, Double> loadinfo = StorageLoadBalancer.instance.getLoadInfo(); (personal preference: log a sleep line, then add a Thread.sleep(a_long_time))
4) start brisk with -t on node-B
5) wait for the log line "Joining: getting bootstrap token"; it should now reach your breakpoint (or sleep)
6) crash the JVM (personal preference: kill -9 pid)

back to node-A:
1) exit the JVM (BriskDaemon) normally (kill pid)
2) start the brisk node again (with -t)

log from node-A:

{noformat}
INFO 23:25:00,213 Logging initialized
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/riptano/work/brisk/resources/cassandra/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/riptano/work/brisk/resources/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
INFO 23:25:00,235 Heap size: 510263296/511311872
INFO 23:25:00,237 JNA not found. 
Native methods will be disabled. INFO 23:25:00,263 Loading settings from file:/home/riptano/work/brisk/resources/cassandra/conf/cassandra.yaml INFO 23:25:00,470 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap INFO 23:25:00,496 Detected Hadoop trackers are enabled, setting my DC to Brisk INFO 23:25:00,696 Global memtable threshold is enabled at 162MB INFO 23:25:00,846 Opening /var/lib/cassandra/data/system/IndexInfo-f-1 INFO 23:25:00,912 Opening /var/lib/cassandra/data/system/Schema-f-2 INFO 23:25:00,926 Opening /var/lib/cassandra/data/system/Schema-f-1 INFO 23:25:00,951 Opening /var/lib/cassandra/data/system/Migrations-f-2 INFO 23:25:00,954 Opening /var/lib/cassandra/data/system/Migrations-f-1 INFO 23:25:00,970 Opening /var/lib/cassandra/data/system/LocationInfo-f-2 INFO 23:25:00,989 Opening /var/lib/cassandra/data/system/LocationInfo-f-1 INFO 23:25:01,089 Loading schema version c4fd2440-7900-11e0--ba846f9adcf7 INFO 23:25:01,499 Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1304810701499.log INFO 23:25:01,530 Replaying /var/lib/cassandra/commitlog/CommitLog-1304810455288.log INFO 23:25:01,675 Finished reading /var/lib/cassandra/commitlog/CommitLog-1304810455288.log INFO 23:25:01,730 Enqueuing flush of Memtable-MetaStore@102170028(869/1086 serialized/live bytes, 3 ops) INFO 23:25:01,735 Writing Memtable-MetaStore@102170028(869/1086 serialized/live bytes, 3 ops) INFO 23:25:01,743 Enqueuing flush of Memtable-sblocks@1075051425(3044096/3805120 serialized/live bytes, 17 ops) INFO 23:25:01,747 Enqueuing flush of Memtable-inode.path@780298059(2848/3560 serialized/live bytes, 59 ops) INFO 23:25:01,748 Enqueuing flush of Memtable-inode.sentinel@1934329031(2848/3560 serialized/live bytes, 59 ops) INFO 23:25:01,748 Enqueuing flush of Memtable-inode@1660575731(6393/7991 serialized/live bytes, 134 ops) INFO 23:25:01,821 Completed flushing /var/lib/cassandra/data/HiveMetaStore/MetaStore-f-1-Data.db (989 bytes) INFO 23:25:01,832 
Writing Memtable-sblocks@1075051425(3044096/3805120 serialized/live bytes, 17 ops) INFO 23:25:01,927 Completed flushing /var/lib/cassandra/data/cfs/sblocks-f-1-Data.db (3045448 bytes) INFO 23:25:01,928 Writing Memtable-inode.path@780298059(2848/3560 serialized/live bytes, 59 ops) INFO 23:25:01,968 Completed flushing /var/lib/cassandra/data/cfs/inode.path-f-1-Data.db (5346 bytes) INFO 23:25:01,969 Writing Memtable-inode.sentinel@1934329031(2848/3560 serialized/live bytes, 59 ops) INFO 23:25:02,035 Completed flushing /var/lib/cassandra/data/cfs/inode.sentinel-f-1-Data.db (1735 bytes) INFO 23:25:02,036 Writing Memtable-inode@1660575731(6393/7991 serialized/live bytes, 134 ops) INFO 23:25:02,085 Completed flushing /var/lib/cassandra/data/cfs/inode-f-1-Data.db (8582 bytes) INFO 23:25:02,087 Log replay complete INFO 23:25:02,092 Cassandra version:
[jira] [Created] (CASSANDRA-2615) in cassandra-cli, the help command output on validation types should be updated
in cassandra-cli, the help command output on validation types should be updated

Key: CASSANDRA-2615
URL: https://issues.apache.org/jira/browse/CASSANDRA-2615
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung
Assignee: Pavel Yaskevich
Priority: Minor

From cassandra-cli, type help assume and you will find:

Supported values are:
- AsciiType
- BytesType
- CounterColumnType (distributed counter column)
- IntegerType (a generic variable-length integer type)
- LexicalUUIDType
- LongType
- UTF8Type

OK, now:

[default@cfs] assume inode comparator as UTF8Type;
Type 'UTF8Type' was not found. Available: bytes, integer, long, lexicaluuid, timeuuid, utf8, ascii.

So it looks like the supported type list should be updated by dropping the Type suffix. On the other hand, you can't really use the short name either:

[default@cfs] update column family inode;
Unable to find abstract-type class 'org.apache.cassandra.db.marshal.utf8'

It looks like the update still needs the Type suffix (case insensitive?).
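The mismatch described above (assume accepts only short names such as utf8, while update needs a class name such as UTF8Type) could be avoided by resolving both spellings to one class name. The sketch below shows one way to do that; it is not the fix in CASSANDRA-2615.patch. `CliTypeNames` and `resolve` are hypothetical names, and the short-name table is built from the "Available:" list quoted in the issue.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch (not cassandra-cli code): accepts "utf8", "UTF8Type" or
// "utf8type" and resolves each to the same fully qualified class name.
final class CliTypeNames {

    static final String PACKAGE = "org.apache.cassandra.db.marshal.";

    // Short names taken from the CLI's "Available:" message in the issue.
    static final Map<String, String> SHORT_TO_CLASS = new LinkedHashMap<>();
    static {
        SHORT_TO_CLASS.put("bytes", "BytesType");
        SHORT_TO_CLASS.put("integer", "IntegerType");
        SHORT_TO_CLASS.put("long", "LongType");
        SHORT_TO_CLASS.put("lexicaluuid", "LexicalUUIDType");
        SHORT_TO_CLASS.put("timeuuid", "TimeUUIDType");
        SHORT_TO_CLASS.put("utf8", "UTF8Type");
        SHORT_TO_CLASS.put("ascii", "AsciiType");
    }

    // Resolves a user-supplied type name, ignoring case and an optional "Type" suffix.
    static String resolve(String name) {
        String key = name.toLowerCase(Locale.ROOT);
        if (key.endsWith("type")) {
            key = key.substring(0, key.length() - "type".length());
        }
        String simpleName = SHORT_TO_CLASS.get(key);
        if (simpleName == null) {
            throw new IllegalArgumentException(
                    "Type '" + name + "' was not found. Available: " + SHORT_TO_CLASS.keySet());
        }
        return PACKAGE + simpleName;
    }

    public static void main(String[] args) {
        System.out.println(resolve("utf8"));
        System.out.println(resolve("UTF8Type"));
    }
}
```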
[jira] [Created] (CASSANDRA-2619) secondary index not dropped until restart
secondary index not dropped until restart

Key: CASSANDRA-2619
URL: https://issues.apache.org/jira/browse/CASSANDRA-2619
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

When dropping the secondary index (via cassandra-cli), describe keyspace still shows the Built indexes entry. Only after a restart of the CassandraDaemon is the Built indexes entry gone. This seems to indicate that the index is not really being dropped completely.

To test, use a single node, create an index, then drop it from the cli (issue an update column family ... with the metadata fields but without the index info).

Below is the original:

Column Families:
  ColumnFamily: inode
  Stores file meta data
    Key Validation Class: org.apache.cassandra.db.marshal.BytesType
    Default column value validator: org.apache.cassandra.db.marshal.BytesType
    Columns sorted by: org.apache.cassandra.db.marshal.BytesType
    Row cache size / save period in seconds: 0.0/0
    Key cache size / save period in seconds: 0.0/14400
    Memtable thresholds: 0.103125/22/1440 (millions of ops/MB/minutes)
    GC grace seconds: 60
    Compaction min/max thresholds: 4/32
    Read repair chance: 1.0
    Replicate on write: false
    {color:red}Built indexes: [inode.path, inode.sentinel]{color}
    Column Metadata:
      Column Name: path (70617468)
        Validation Class: org.apache.cassandra.db.marshal.BytesType
        {color:red}Index Name: path Index Type: KEYS{color}
      Column Name: sentinel (73656e74696e656c)
        Validation Class: org.apache.cassandra.db.marshal.BytesType
        {color:red}Index Name: sentinel Index Type: KEYS{color}

Issue an update:

{noformat}
[default@unknown] use cfs;
Authenticated to keyspace: cfs
[default@cfs] update column family inode with comparator=BytesType and column_metadata=[{column_name:70617468, validation_class:BytesType}, {column_name:73656e74696e656c,validation_class:BytesType}];
fca46d00-783c-11e0--242d50cf1fff
Waiting for schema agreement...
... schemas agree across the cluster
{noformat}

Describe the keyspace again:

Keyspace: cfs:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
    Options: [Brisk:1, Cassandra:0]
  Column Families:
    ColumnFamily: inode
    Stores file meta data
      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Columns sorted by: org.apache.cassandra.db.marshal.BytesType
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 0.0/14400
      Memtable thresholds: 0.103125/22/1440 (millions of ops/MB/minutes)
      GC grace seconds: 60
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: false
      {color:red}Built indexes: [inode.path, inode.sentinel]{color}
      Column Metadata:
        Column Name: path (70617468)
          Validation Class: org.apache.cassandra.db.marshal.BytesType
        Column Name: sentinel (73656e74696e656c)
          Validation Class: org.apache.cassandra.db.marshal.BytesType

*Notice the red line on Built indexes.*

Restart CassandraDaemon, describe again:

Keyspace: cfs:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
    Options: [Brisk:1, Cassandra:0]
  Column Families:
    ColumnFamily: inode
    Stores file meta data
      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Columns sorted by: org.apache.cassandra.db.marshal.BytesType
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 0.0/14400
      Memtable thresholds: 0.103125/22/1440 (millions of ops/MB/minutes)
      GC grace seconds: 60
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: false
      {color:red}Built indexes: []{color}
      Column Metadata:
        Column Name: path (70617468)
          Validation Class: org.apache.cassandra.db.marshal.BytesType
        Column Name: sentinel (73656e74696e656c)
          Validation Class: org.apache.cassandra.db.marshal.BytesType

On another note, upon
re-creating the index, it does not appear that the index is actually rebuilt. The Built indexes entry shows up in describe without a restart of CassandraDaemon, but the update completes very quickly. We could tell the index was not being rebuilt because we were getting an NPE from:
{noformat}
java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:51)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
    at
[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query
[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029005#comment-13029005 ]

Jackson Chung commented on CASSANDRA-2401:

I have existing data that was producing a similar NPE before the patch. After applying the patch, the following is observed:

{noformat}
DEBUG [ReadStage:82] 2011-05-04 21:23:27,114 ColumnFamilyStore.java (line 1514) fetched data row ColumnFamily(inode -deleted at 130436368- [70617468:false:49@1304363600219,])
DEBUG [ReadStage:82] 2011-05-04 21:23:27,114 ColumnFamilyStore.java (line 1532) row ColumnFamily(inode -deleted at 130436368- [70617468:false:49@1304363600219,]) satisfies all clauses
DEBUG [ReadStage:82] 2011-05-04 21:23:27,115 ColumnFamilyStore.java (line 1514) fetched data row ColumnFamily(inode [70617468:false:10@1304353355296,])
ERROR [ReadStage:82] 2011-05-04 21:23:27,115 AbstractCassandraDaemon.java (line 112) Fatal exception in thread Thread[ReadStage:82,5,main]
java.lang.AssertionError: No data found for NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in DecoratedKey(29842926756667498147838693957802723793, 3134346637326336393966396130336561376538623330316566383561616131):QueryPath(columnFamilyName='inode', superColumnName='null', columnName='null') (original filter NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from expression 73656e74696e656cEQ78
    at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1512)
    at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}

Was the fix intended only to avoid future problems, so that an existing problem like this one would still need a workaround?

getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query

Key: CASSANDRA-2401
URL: https://issues.apache.org/jira/browse/CASSANDRA-2401
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.7.0
Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse
Reporter: Tey Kar Shiang
Assignee: Jonathan Ellis
Fix For: 0.7.6
Attachments: 2401.txt

In ColumnFamilyStore.java, near line 1680, ColumnFamily data = getColumnFamily(new QueryFilter(dk, path, firstFilter)) returns null, causing a null pointer exception in satisfies(data, clause, primary) that is not caught. The callback then times out and returns a Timeout exception to Hector.

The data is empty: as I traced it, the column count is 0 in removeDeletedCF(), which returns null there. (I am new and still trying to understand the logic around this.) Instead of crashing on null, could we skip the data?

About my test: a stress-test program adds, modifies and deletes data in the keyspace. I have 30 threads simulating concurrent users performing the actions above, and periodically doing a query over all rows. I have a Column Family with rows (as File) and columns as index (e.g. userID, fileType).

There was no issue on the first day of the test, which was then stopped for 3 days. I restarted the test on the 4th day, and 1 of the users failed to query the files (timeout exception received). Most of the users are still okay with the query.
[jira] [Created] (CASSANDRA-2528) NPE from PrecompactedRow
NPE from PrecompactedRow

Key: CASSANDRA-2528
URL: https://issues.apache.org/jira/browse/CASSANDRA-2528
Project: Cassandra
Issue Type: Bug
Reporter: Jackson Chung

Received an NPE from trunk (0.8) on PrecompactedRow:

ERROR [CompactionExecutor:2] 2011-04-21 17:21:31,610 AbstractCassandraDaemon.java (line 112) Fatal exception in thread Thread[CompactionExecutor:2,1,main]
java.lang.NullPointerException
    at org.apache.cassandra.io.PrecompactedRow.init(PrecompactedRow.java:86)
    at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:167)
    at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:124)
    at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:44)
    at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
    at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
    at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:553)
    at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:146)
    at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:112)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

The size of data in /var/lib/cassandra is 11G here, but there is also a report of the same error with 1.7G of data. The data was previously populated from Cassandra 0.7.4.

Added debug logging, not sure how much this helps (this is logged before the exception):

INFO [CompactionExecutor:2] 2011-04-21 17:21:31,588 CompactionManager.java (line 534) Compacting Major: [SSTableReader(path='/var/lib/cassandra/data/cfs/inode.path-f-10-Data.db'), SSTableReader(path='/var/lib/cassandra/data/cfs/inode.path-f-7-Data.db'), SSTableReader(path='/var/lib/cassandra/data/cfs/inode.path-f-6-Data.db'), SSTableReader(path='/var/lib/cassandra/data/cfs/inode.path-f-8-Data.db'), SSTableReader(path='/var/lib/cassandra/data/cfs/inode.path-f-9-Data.db')]
DEBUG [CompactionExecutor:2] 2011-04-21 17:21:31,588 SSTableReader.java (line 132) index size for bloom filter calc for file : /var/lib/cassandra/data/cfs/inode.path-f-10-Data.db : 256
DEBUG [CompactionExecutor:2] 2011-04-21 17:21:31,588 SSTableReader.java (line 132) index size for bloom filter calc for file : /var/lib/cassandra/data/cfs/inode.path-f-7-Data.db : 512
DEBUG [CompactionExecutor:2] 2011-04-21 17:21:31,588 SSTableReader.java (line 132) index size for bloom filter calc for file : /var/lib/cassandra/data/cfs/inode.path-f-6-Data.db : 768
DEBUG [CompactionExecutor:2] 2011-04-21 17:21:31,589 SSTableReader.java (line 132) index size for bloom filter calc for file : /var/lib/cassandra/data/cfs/inode.path-f-8-Data.db : 1024
DEBUG [CompactionExecutor:2] 2011-04-21 17:21:31,589 SSTableReader.java (line 132) index size for bloom filter calc for file : /var/lib/cassandra/data/cfs/inode.path-f-9-Data.db : 1280
INFO [CompactionExecutor:2] 2011-04-21 17:21:31,609 CompactionIterator.java (line 185) Major@1181554512(cfs, inode.path, 523/10895) now compacting at 16777 bytes/ms.
[jira] [Commented] (CASSANDRA-2528) NPE from PrecompactedRow
[ https://issues.apache.org/jira/browse/CASSANDRA-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022884#comment-13022884 ]

Jackson Chung commented on CASSANDRA-2528:

sure, here is the Assertion Error, the null is on metadata:

NPE from PrecompactedRow

Key: CASSANDRA-2528
URL: https://issues.apache.org/jira/browse/CASSANDRA-2528
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.8 beta 1
Reporter: Jackson Chung
Assignee: Jonathan Ellis
Fix For: 0.8.0
Attachments: 2528.txt
[jira] [Commented] (CASSANDRA-2528) NPE from PrecompactedRow
[ https://issues.apache.org/jira/browse/CASSANDRA-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022885#comment-13022885 ]

Jackson Chung commented on CASSANDRA-2528:

DEBUG [CompactionExecutor:2] 2011-04-21 19:31:46,797 PrecompactedRow.java (line 85) debugging controoler true
DEBUG [CompactionExecutor:2] 2011-04-21 19:31:46,798 PrecompactedRow.java (line 87) debugging compactedCF: ColumnFamily(anonymous [3636363663643736663936393536343639653762653339643735306363376439:false:0@1303340825329,])
INFO [CompactionExecutor:2] 2011-04-21 19:31:46,799 CompactionIterator.java (line 185) Major@31583366(cfs, inode.path, 772/11720) now compacting at 8388 bytes/ms.
ERROR [CompactionExecutor:2] 2011-04-21 19:31:46,850 AbstractCassandraDaemon.java (line 112) Fatal exception in thread Thread[CompactionExecutor:2,1,main]
java.lang.AssertionError
    at org.apache.cassandra.io.PrecompactedRow.init(PrecompactedRow.java:91)
    at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:167)
    at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:124)
    at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:44)
    at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
    at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
    at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:553)
    at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:146)
    at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:112)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
[jira] [Commented] (CASSANDRA-2528) NPE from PrecompactedRow
[ https://issues.apache.org/jira/browse/CASSANDRA-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022886#comment-13022886 ]

Jackson Chung commented on CASSANDRA-2528:

{code:title=PrecompactedRow.java}
85  logger.debug(String.format("debugging controoler %b", controller.shouldPurge(key)));
86  compactedCf = controller.shouldPurge(key) ? ColumnFamilyStore.removeDeleted(cf, controller.gcBefore) : cf;
87  logger.debug(String.format("debugging compactedCF: %s", compactedCf.toString()));
88  //if (compactedCf != null && compactedCf.metadata().getDefaultValidator().isCommutative())
89  if (compactedCf != null)
90  {
91      assert compactedCf.metadata() != null;
92      assert compactedCf.metadata().getDefaultValidator() != null;
93      if (compactedCf.metadata().getDefaultValidator().isCommutative())
94      {
95          CounterColumn.removeOldShards(compactedCf, controller.gcBefore);
96      }
{code}
[jira] [Commented] (CASSANDRA-2528) NPE from PrecompactedRow
[ https://issues.apache.org/jira/browse/CASSANDRA-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022887#comment-13022887 ]

Jackson Chung commented on CASSANDRA-2528:

$ grep java.lang.AssertionError -A3 /var/log/cassandra/system.log
{noformat}
java.lang.AssertionError
    at org.apache.cassandra.io.PrecompactedRow.init(PrecompactedRow.java:91)
    at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:167)
    at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:124)
--
java.lang.AssertionError
    at org.apache.cassandra.io.PrecompactedRow.init(PrecompactedRow.java:91)
    at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:167)
    at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:124)
{noformat}
[jira] [Commented] (CASSANDRA-2528) NPE from PrecompactedRow
[ https://issues.apache.org/jira/browse/CASSANDRA-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022962#comment-13022962 ]

Jackson Chung commented on CASSANDRA-2528:

The fix does address the NPE, as all CFs now have metadata.

NPE from PrecompactedRow

Key: CASSANDRA-2528
URL: https://issues.apache.org/jira/browse/CASSANDRA-2528
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.8 beta 1
Reporter: Jackson Chung
Assignee: Jonathan Ellis
Fix For: 0.8.0
Attachments: 2528.txt, 2528.txt
[jira] [Commented] (CASSANDRA-2499) cassandra-env.sh pattern matching for OpenJDK broken in some cases
[ https://issues.apache.org/jira/browse/CASSANDRA-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022254#comment-13022254 ]

Jackson Chung commented on CASSANDRA-2499:

Try this:

-java_version=`java -version 2>&1`
-if [[ $java_version != *OpenJDK* ]]
+check_openjdk=$(java -version 2>&1 | awk '{if (NR == 2) {print $1}}')
+if [ "$check_openjdk" != "OpenJDK" ]

{noformat}
$ bash /tmp/testjdk java
version: Java(TM)
not OpenJDK
$ dash /tmp/testjdk /usr/bin/java
version: OpenJDK
yes OpenJDK
$ cat /tmp/testjdk
# $1 is the java binary to probe; "java -version" writes to stderr, hence 2>&1
check_openjdk=$($1 -version 2>&1 | awk '{if (NR == 2) {print $1}}')
echo "version: $check_openjdk"
if [ "$check_openjdk" != "OpenJDK" ]
then
    echo "not OpenJDK"
    JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.1.jar"
else
    echo "yes OpenJDK"
fi
{noformat}

cassandra-env.sh pattern matching for OpenJDK broken in some cases

Key: CASSANDRA-2499
URL: https://issues.apache.org/jira/browse/CASSANDRA-2499
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8
Reporter: Tyler Hobbs
Assignee: Jackson Chung
Fix For: 0.8

With bash version 4.1.5, the section of cassandra-env that tries to match the JDK distribution seems to have some kind of syntax error. I get the following message when running bin/cassandra:
{noformat}
bin/../conf/cassandra-env.sh: 99: [[: not found
{noformat}
[jira] [Commented] (CASSANDRA-2520) use -z to test for empty variables to make sh and dash happy
[ https://issues.apache.org/jira/browse/CASSANDRA-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022320#comment-13022320 ]

Jackson Chung commented on CASSANDRA-2520:
------------------------------------------

Any interest in doing them all?

$ grep x\\$ ./bin/* ./conf/* -r
./bin/cassandra:if [ "x$CASSANDRA_INCLUDE" = "x" ]; then
./bin/cassandra:if [ "x$pidpath" != "x" ]; then
./bin/cassandra:if [ "x$foreground" != "x" ]; then
./bin/cassandra-cli:if [ "x$CASSANDRA_INCLUDE" = "x" ]; then
./bin/cassandra.in.sh:if [ "x$CASSANDRA_HOME" = "x" ]; then
./bin/cassandra.in.sh:if [ "x$CASSANDRA_CONF" = "x" ]; then
./bin/clustertool:if [ "x$CASSANDRA_INCLUDE" = "x" ]; then
./bin/json2sstable:if [ "x$CASSANDRA_INCLUDE" = "x" ]; then
./bin/nodetool:if [ "x$CASSANDRA_INCLUDE" = "x" ]; then
./bin/sstable2json:if [ "x$CASSANDRA_INCLUDE" = "x" ]; then
./bin/sstablekeys:if [ "x$CASSANDRA_INCLUDE" = "x" ]; then
./conf/cassandra-env.sh:if [ "x$MAX_HEAP_SIZE" = "x" ] && [ "x$HEAP_NEWSIZE" = "x" ]; then
./conf/cassandra-env.sh:if [ "x$MAX_HEAP_SIZE" = "x" ] || [ "x$HEAP_NEWSIZE" = "x" ]; then
./conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"

> use -z to test for empty variables to make sh and dash happy
> ------------------------------------------------------------
>
> Key: CASSANDRA-2520
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2520
> Project: Cassandra
> Issue Type: Bug
> Components: Packaging
> Reporter: Jonathan Ellis
> Assignee: Jonathan Ellis
> Priority: Minor
> Fix For: 0.7.5
> Attachments: 2520-0.7.txt
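For reference, the -z form that the issue title asks for can be compared with the x-prefix idiom from the grep output above. This is a sketch only: the helper names are hypothetical, and the attached patch (2520-0.7.txt) may word the tests differently.

```shell
#!/bin/sh
# Old idiom seen in the grep output: prefix both sides with x so the
# comparison still works when the variable is empty.
is_empty_x_idiom() {
    [ "x$1" = "x" ]
}

# Equivalent test with -z; the quoted operand is safe in sh, dash and bash.
is_empty() {
    [ -z "$1" ]
}

# -n is the counterpart for the `!= "x"` checks (e.g. pidpath, foreground).
is_set() {
    [ -n "$1" ]
}
```

With these, a line such as if [ "x$CASSANDRA_INCLUDE" = "x" ]; then would read if [ -z "$CASSANDRA_INCLUDE" ]; then, and the pidpath and foreground checks would use -n.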
[jira] [Commented] (CASSANDRA-2499) cassandra-env.sh pattern matching for OpenJDK broken in some cases
[ https://issues.apache.org/jira/browse/CASSANDRA-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021889#comment-13021889 ]

Jackson Chung commented on CASSANDRA-2499:
------------------------------------------

Chances are the default /bin/sh is linked to dash; please run ls -al /bin/sh to confirm.

A minor fix could be to remove the [[ and add quotes ("") around the variables and values:

-if [[ $java_version != *OpenJDK* ]]
+if [ "$java_version" != "*OpenJDK*" ]

A more drastic fix is to replace every /bin/sh with /bin/bash (if we want to support bash only). See ref: https://wiki.ubuntu.com/DashAsBinSh

I suggest this be reviewed before a decision is made.

> cassandra-env.sh pattern matching for OpenJDK broken in some cases
> ------------------------------------------------------------------
>
> Key: CASSANDRA-2499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2499
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8
> Reporter: Tyler Hobbs
> Assignee: Eric Evans
> Fix For: 0.8
>
> With bash version 4.1.5, the section of cassandra-env that tries to match the JDK distribution seems to have some kind of syntax error. I get the following message when running bin/cassandra:
> {noformat}
> bin/../conf/cassandra-env.sh: 99: [[: not found
> {noformat}
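A third option, not proposed in the comment above, is a case statement, which does glob-style matching in any POSIX shell. It is worth noting because the single-bracket [ compares strings literally and does not treat *OpenJDK* as a pattern. A minimal sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical helper: succeed when the text in $1 mentions OpenJDK.
# `case` patterns are part of POSIX sh, so this works under dash,
# which does not implement `[[ ... ]]`.
mentions_openjdk() {
    case "$1" in
        *OpenJDK*) return 0 ;;
        *)         return 1 ;;
    esac
}
```

A caller could capture the text with java_version=$(java -version 2>&1) and then branch on mentions_openjdk "$java_version".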
[jira] [Issue Comment Edited] (CASSANDRA-2499) cassandra-env.sh pattern matching for OpenJDK broken in some cases
[ https://issues.apache.org/jira/browse/CASSANDRA-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021889#comment-13021889 ]

Jackson Chung edited comment on CASSANDRA-2499 at 4/20/11 1:11 AM:
-------------------------------------------------------------------

Chances are the default /bin/sh is linked to dash; please run ls -al /bin/sh to confirm.

A minor fix could be to remove the [[ and add quotes ("") around the variables and values:

-if [[ $java_version != \*OpenJDK\* ]]
+if [ "$java_version" != "\*OpenJDK\*" ]

A more drastic fix is to replace every /bin/sh with /bin/bash (if we want to support bash only). See ref: https://wiki.ubuntu.com/DashAsBinSh

I suggest this be reviewed before a decision is made.

was (Author: cywjackson):

Chances are the default /bin/sh is linked to dash; please run ls -al /bin/sh to confirm.

A minor fix could be to remove the [[ and add quotes ("") around the variables and values:

-if [[ $java_version != *OpenJDK* ]]
+if [ "$java_version" != "*OpenJDK*" ]

A more drastic fix is to replace every /bin/sh with /bin/bash (if we want to support bash only). See ref: https://wiki.ubuntu.com/DashAsBinSh

I suggest this be reviewed before a decision is made.

> cassandra-env.sh pattern matching for OpenJDK broken in some cases
> ------------------------------------------------------------------
>
> Key: CASSANDRA-2499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2499
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8
> Reporter: Tyler Hobbs
> Assignee: Eric Evans
> Fix For: 0.8
>
> With bash version 4.1.5, the section of cassandra-env that tries to match the JDK distribution seems to have some kind of syntax error. I get the following message when running bin/cassandra:
> {noformat}
> bin/../conf/cassandra-env.sh: 99: [[: not found
> {noformat}
[jira] [Created] (CASSANDRA-2489) better not close the out/error stream from CassandraDaemon
better not close the out/error stream from CassandraDaemon
----------------------------------------------------------

Key: CASSANDRA-2489
URL: https://issues.apache.org/jira/browse/CASSANDRA-2489
Project: Cassandra
Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor

When Cassandra is started in the background (-p /tmp/pidfile.pid), System.out and System.err are closed. That may not be suitable where capturing a thread dump via the traditional SIGQUIT (kill -3 pid) is preferred. Thread dumps are especially useful when JMX collection is failing, or when a small background script is used to capture them continuously. Closing System.err could also swallow a fatal error that crashes the JVM.

I suggest changing the stock Cassandra start script so that, when it is not meant to run in the foreground (i.e. it runs in the background), it redirects standard output/error to a file, e.g.: > /var/log/cassandra/cassandra.out 2>&1

Currently I apply a workaround to the script by faking the cassandra-foreground flag as -Dcassandra-foreground=no, since the check simply looks for null:

{code:title=AbstractCassandraDaemon.activate()}
if (System.getProperty("cassandra-foreground") == null)
{code}
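The redirect suggested in the description could look roughly like the following in a start script. This is a sketch under assumptions: start_in_background is a hypothetical helper, not part of the stock bin/cassandra script, and the log and pid paths are passed in by the caller.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command in the background with stdout and
# stderr appended to a log file, and record its PID.
# $1 = log file, $2 = pid file, remaining arguments = command to run.
start_in_background() {
    log_file=$1
    pid_file=$2
    shift 2
    # 2>&1 sends standard error to the same file as standard output.
    "$@" >> "$log_file" 2>&1 &
    # $! is the PID of the most recently started background job.
    echo $! > "$pid_file"
}
```

A start script could call it along the lines of start_in_background /var/log/cassandra/cassandra.out /tmp/pidfile.pid followed by the usual java command line. The streams must also stay open inside the JVM for a kill -3 thread dump to reach the file, which is what the -Dcassandra-foreground=no workaround in the description is for.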