[jira] [Updated] (CASSANDRA-12860) Nodetool repair fragile: cannot properly recover from single node failure. Has to restart all nodes in order to repair again
[ https://issues.apache.org/jira/browse/CASSANDRA-12860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Wu updated CASSANDRA-12860:
    Description:
Summary of symptom:
- Setup is a multi-region cluster in AWS (5 regions). Each region has at least 4 hosts, with RF = 1/2 the number of nodes, using vnodes (256)
- How to reproduce:
-- On node A, start this repair job (again, we are running a fresh 3.5.0):
{code}nohup sudo nodetool repair -j 2 -pr -full myks > /tmp/repair.log 2>&1 &{code}
-- Job starts fine, reporting progress like
{noformat}
[2016-10-28 22:37:52,692] Starting repair command #1, repairing keyspace myks with repair options (parallelism: parallel, primary range: true, incremental: false, job threads: 2, ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 256)
[2016-10-28 22:38:35,099] Repair session 36f13450-9d5f-11e6-8bf7-a9f47ff986a9 for range [(4029874034937227774,4033949979656106020]] finished (progress: 1%)
[2016-10-28 22:38:38,769] Repair session 36f30910-9d5f-11e6-8bf7-a9f47ff986a9 for range [(-2395606719402271267,-2394525508513518837]] finished (progress: 1%)
[2016-10-28 22:38:48,521] Repair session 36f3f370-9d5f-11e6-8bf7-a9f47ff986a9 for range [(-5223108861718702793,-5221117649630514419]] finished (progress: 2%)
{noformat}
-- Then manually shut down another node (node B) in the same region (haven't tried with another region yet, but expect the same behavior from past experience)
-- Shortly after that, seeing this message in the job log (as well as in system.log) on node A:
{noformat}
[2016-10-28 22:41:46,268] Repair session 37088ce1-9d5f-11e6-8bf7-a9f47ff986a9 for range [(-928974038666914990,-927967994563261540]] failed with error Endpoint /node_B_ip died (progress: 51%)
{noformat}
-- From this point on, the repair job seems to hang:
--- no further messages in the job log
--- nor any related messages in system.log
--- CPU stayed low (low single-digit percent of 1 CPU)
-- After an hour, manually kill the repair jobs (located using "ps -eaf | grep repair")
-- Restart C* on node A
--- Verified the system is up with no error messages in system.log
--- Also verified that there are no error messages from node B
-- After node A settles down (e.g. no new messages in system.log), restart the same repair job:
{code}nohup sudo nodetool repair -j 2 -pr -full myks > /tmp/repair.log 2>&1 &{code}
-- Job fails pretty quickly, reporting errors from more nodes, B and K:
{noformat}
[y...@cass-tm-1b-012.apse1.mashery.com ~]$ tail -f /tmp/repair.log
[2016-10-28 22:49:52,965] Starting repair command #1, repairing keyspace myks with repair options (parallelism: parallel, primary range: true, incremental: false, job threads: 2, ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 256)
[2016-10-28 22:50:15,839] Repair session e4180720-9d60-11e6-b2f9-cb9524b3c536 for range [(4029874034937227774,4033949979656106020]] failed with error [repair #e4180720-9d60-11e6-b2f9-cb9524b3c536 on myks/rtable, [(4029874034937227774,4033949979656106020]]] Validation failed in /node_K_ip (progress: 1%)
[2016-10-28 22:50:17,158] Repair session e419dbe0-9d60-11e6-b2f9-cb9524b3c536 for range [(-2395606719402271267,-2394525508513518837]] failed with error [repair #e419dbe0-9d60-11e6-b2f9-cb9524b3c536 on myks/rtable, [(-2395606719402271267,-2394525508513518837]]] Validation failed in /node_B_ip (progress: 1%)
[2016-10-28 22:50:18,256] Repair session e41b1460-9d60-11e6-b2f9-cb9524b3c536 for range [(-5223108861718702793,-5221117649630514419]] failed with error [repair #e41b1460-9d60-11e6-b2f9-cb9524b3c536 on myks/rtable, [(-5223108861718702793,-5221117649630514419]]] Validation failed in /node_B_ip (progress: 2%)
{noformat}
-- On the said nodes (B and K), seeing similar errors:
{noformat}
ERROR [ValidationExecutor:5] 2016-10-28 22:58:45,307 CompactionManager.java:1320 - Cannot start multiple repair sessions over the same sstables
ERROR [ValidationExecutor:5] 2016-10-28 22:58:45,307 Validator.java:261 - Failed creating a merkle tree for [repair #14378ec0-9d62-11e6-ab75-cd4d64a01b02 on myks/atable, [(4029874034937227774,4033949979656106020]]], /52.220.127.190 (see log for details)
INFO [AntiEntropyStage:1] 2016-10-28 22:58:45,307 Validator.java:274 - [repair #14378ec0-9d62-11e6-ab75-cd4d64a01b02] Sending completed merkle tree to /52.220.127.190 for myks.xtable
ERROR [ValidationExecutor:5] 2016-10-28 22:58:45,308 CassandraDaemon.java:195 - Exception in thread Thread[ValidationExecutor:5,1,main]
java.lang.RuntimeException: Cannot start multiple repair sessions over the same sstables
    at org.apache.cassandra.db.compaction.CompactionManager.getSSTablesToValidate(CompactionManager.java:1321) ~[apache-cassandra-3.5.0.jar:3.5.0]
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1211) ~[apache-cassandra-3.5.0.jar:3.5.0]
    at org.apache.cassandra.db.compaction.CompactionM
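When comparing the first and second repair runs, it can help to tally the job log by outcome and by the endpoint named in each failure. The following is a minimal sketch, assuming the log format matches the nodetool output quoted above; the function name `summarize_repair_log` and both regular expressions are hypothetical helpers written for this illustration, not part of Cassandra or nodetool.

```python
import re
from collections import Counter

# Per-session progress lines as they appear in the job log quoted above, e.g.
# "[<timestamp>] Repair session <uuid> for range [(a,b]] finished (progress: 1%)".
# The pattern is an assumption derived only from those sample lines.
SESSION_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] Repair session (?P<sid>[0-9a-f-]+) for range "
    r"(?P<range>\[\(.*?\]\]) (?P<status>finished|failed)"
    r"(?: with error (?P<error>.*?))? \(progress: (?P<pct>\d+)%\)$"
)

# Endpoint named in "Endpoint /x died" or "Validation failed in /x".
ENDPOINT_RE = re.compile(r"(?:Endpoint|Validation failed in) /(\S+)")


def summarize_repair_log(lines):
    """Return (status counts, failure counts per endpoint) for a job log."""
    statuses = Counter()
    failures_by_endpoint = Counter()
    for line in lines:
        match = SESSION_RE.match(line.strip())
        if not match:
            continue  # e.g. the "Starting repair command" line
        statuses[match["status"]] += 1
        if match["status"] == "failed":
            endpoint = ENDPOINT_RE.search(match["error"] or "")
            failures_by_endpoint[endpoint.group(1) if endpoint else "unknown"] += 1
    return statuses, failures_by_endpoint


# Lines copied from the job logs in the description.
SAMPLE_LOG = [
    "[2016-10-28 22:38:35,099] Repair session 36f13450-9d5f-11e6-8bf7-a9f47ff986a9 for range [(4029874034937227774,4033949979656106020]] finished (progress: 1%)",
    "[2016-10-28 22:41:46,268] Repair session 37088ce1-9d5f-11e6-8bf7-a9f47ff986a9 for range [(-928974038666914990,-927967994563261540]] failed with error Endpoint /node_B_ip died (progress: 51%)",
    "[2016-10-28 22:50:15,839] Repair session e4180720-9d60-11e6-b2f9-cb9524b3c536 for range [(4029874034937227774,4033949979656106020]] failed with error [repair #e4180720-9d60-11e6-b2f9-cb9524b3c536 on myks/rtable, [(4029874034937227774,4033949979656106020]]] Validation failed in /node_K_ip (progress: 1%)",
    "[2016-10-28 22:50:17,158] Repair session e419dbe0-9d60-11e6-b2f9-cb9524b3c536 for range [(-2395606719402271267,-2394525508513518837]] failed with error [repair #e419dbe0-9d60-11e6-b2f9-cb9524b3c536 on myks/rtable, [(-2395606719402271267,-2394525508513518837]]] Validation failed in /node_B_ip (progress: 1%)",
]

if __name__ == "__main__":
    statuses, endpoints = summarize_repair_log(SAMPLE_LOG)
    print(dict(statuses))
    print(dict(endpoints))
```

In practice the input would be the lines of /tmp/repair.log; a failure count spread over several endpoints after the restart would match the "errors from more nodes" symptom described above.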
[jira] [Updated] (CASSANDRA-12860) Nodetool repair fragile: cannot properly recover from single node failure. Has to restart all nodes in order to repair again
[ https://issues.apache.org/jira/browse/CASSANDRA-12860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Wu updated CASSANDRA-12860:
    Environment: CentOS 6.7, Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode), Cassandra 3.5.0, fresh install (was: CentOS 7, JDK 8u60, Cassandra 2.2.2 (upgraded from 2.1.5))
    Priority: Critical (was: Minor)
    Description:
Summary of symptom:
- Setup is a multi-region cluster in AWS (5 regions). Each region has at least 4 hosts, with RF = 1/2 the number of nodes, using vnodes (256)
- How to reproduce:
-- On node A, start this repair job (again, we are running a fresh 3.5.0):
{code}
sudo nodetool repair -pr my_keyspace > /tmp/repair.log 2>&1 &
{code}
-- Job starts fine, reporting progress like
{noformat}
[2016-10-28 21:57:44,427] Repair session 03b6ca61-9d59-11e6-b118-b9abfef3117a for range [(2427717901143689479,2428773541412139342]] finished (progress: 30%)
{noformat}
-- Then manually shut down another node (node B) in the same region (haven't tried with another region yet, but expect the same behavior from past experience)
-- Shortly after that, seeing this message in the job log (as well as in system.log) on node A:
{noformat}
[2016-10-28 21:59:40,835] Repair session 04000861-9d59-11e6-b118-b9abfef3117a for range [(6981391007853361210,6983870256023436902]] failed with error Endpoint /52.220.127.177 died (progress: 59%)
{noformat}
-- At this point, the repair job seems to hang:
--- no further messages in the job log
--- nor any related messages in system.log
--- CPU stayed low (<5%)
-- After an hour, manually kill the repair jobs (located using "ps -eaf | grep repair")
-- Restart C* on node A
--- Verified the system is up with no error messages in system.log
--- Also verified that there are no error messages from node B
-- After node A settles down (e.g. no new messages in system.log), restart the same repair job:
{code}
sudo nodetool repair -pr my_keyspace > /tmp/repair.log 2>&1 &
{code}
-- Job fails pretty quickly, reporting an error from another node, K:
{noformat}
[y...@cass-tm-1b-012.apse1.mashery.com ~]$ tail -f /tmp/repair.log
nohup: ignoring input
[2016-10-28 22:15:31,770] Starting repair command #1, repairing keyspace my_keyspace with repair options (parallelism: parallel, primary range: true, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 256)
[2016-10-28 22:15:55,375] Repair session 17b7c390-9d5c-11e6-ba28-61f7d2732e5e for range [(4029874034937227774,4033949979656106020]] failed with error [repair #17b7c390-9d5c-11e6-ba28-61f7d2732e5e on my_keyspace/atable, [(4029874034937227774,4033949979656106020]]] Validation failed in /NodeK (progress: 1%)
{noformat}
-- Go to node K and tail/view system.log, seeing:
{noformat}
ERROR [ValidationExecutor:3] 2016-10-28 22:15:55,226 CompactionManager.java:1320 - Cannot start multiple repair sessions over the same sstables
ERROR [ValidationExecutor:3] 2016-10-28 22:15:55,226 Validator.java:261 - Failed creating a merkle tree for [repair #17b7c390-9d5c-11e6-ba28-61f7d2732e5e on my_keyspace/atable, [(4029874034937227774,4033949979656106020]]], /52.220.127.190 (see log for details)
ERROR [ValidationExecutor:3] 2016-10-28 22:15:55,227 CassandraDaemon.java:195 - Exception in thread Thread[ValidationExecutor:3,1,main]
java.lang.RuntimeException: Cannot start multiple repair sessions over the same sstables
    at org.apache.cassandra.db.compaction.CompactionManager.getSSTablesToValidate(CompactionManager.java:1321) ~[apache-cassandra-3.5.0.jar:3.5.0]
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1211) ~[apache-cassandra-3.5.0.jar:3.5.0]
    at org.apache.cassandra.db.compaction.CompactionManager.access$700(CompactionManager.java:81) ~[apache-cassandra-3.5.0.jar:3.5.0]
    at org.apache.cassandra.db.compaction.CompactionManager$11.call(CompactionManager.java:841) ~[apache-cassandra-3.5.0.jar:3.5.0]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
ERROR [ValidationExecutor:3] 2016-10-28 22:15:55,468 CompactionManager.java:1320 - Cannot start multiple repair sessions over the same sstables
ERROR [ValidationExecutor:3] 2016-10-28 22:15:55,468 Validator.java:261 - Failed creating a merkle tree for [repair #17b7c390-9d5c-11e6-ba28-61f7d2732e5e on my_keyspace/btable, [(4029874034937227774,4033949979656106020]]], /52.220.127.190 (see log for details)
ERROR [ValidationExecutor:3] 2016-10-28 22:15:55,469 CassandraDaemon.java:195 - Exception in thread Thread[ValidationExecutor:3,1,main]
java.lang.RuntimeException
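On the replica side, the system.log excerpt above shows the same repair session failing validation on more than one table. A rough sketch of how those entries could be grouped is shown here, assuming the "Failed creating a merkle tree" message keeps the shape seen in this report; `failed_tables_by_session` and the regular expression are hypothetical names for illustration only.

```python
import re
from collections import defaultdict

# Shape of the Validator error line as quoted in this report (an assumption
# based on the 3.5.0 log excerpt, not a documented log format).
MERKLE_FAIL_RE = re.compile(
    r"Failed creating a merkle tree for \[repair #(?P<sid>[0-9a-f-]+) on "
    r"(?P<ks>[^/\s]+)/(?P<table>[^,\s]+),"
)


def failed_tables_by_session(lines):
    """Map each repair session id to the sorted tables whose validation failed."""
    tables = defaultdict(set)
    for line in lines:
        match = MERKLE_FAIL_RE.search(line)
        if match:
            tables[match["sid"]].add(f'{match["ks"]}.{match["table"]}')
    return {sid: sorted(names) for sid, names in tables.items()}


# Lines copied from node K's system.log in the description.
SAMPLE_SYSTEM_LOG = [
    "ERROR [ValidationExecutor:3] 2016-10-28 22:15:55,226 CompactionManager.java:1320 - Cannot start multiple repair sessions over the same sstables",
    "ERROR [ValidationExecutor:3] 2016-10-28 22:15:55,226 Validator.java:261 - Failed creating a merkle tree for [repair #17b7c390-9d5c-11e6-ba28-61f7d2732e5e on my_keyspace/atable, [(4029874034937227774,4033949979656106020]]], /52.220.127.190 (see log for details)",
    "ERROR [ValidationExecutor:3] 2016-10-28 22:15:55,468 Validator.java:261 - Failed creating a merkle tree for [repair #17b7c390-9d5c-11e6-ba28-61f7d2732e5e on my_keyspace/btable, [(4029874034937227774,4033949979656106020]]], /52.220.127.190 (see log for details)",
]

if __name__ == "__main__":
    for session, names in failed_tables_by_session(SAMPLE_SYSTEM_LOG).items():
        print(session, names)
```

Grouping by session id rather than by timestamp keeps the result stable when several tables fail within the same second, as they do in the excerpt.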
[jira] [Created] (CASSANDRA-12860) CLONE - RepairException: [repair #... on .../..., (...,...]] Validation failed in /w.x.y.z
Bing Wu created CASSANDRA-12860:
---
Summary: CLONE - RepairException: [repair #... on .../..., (...,...]] Validation failed in /w.x.y.z
Key: CASSANDRA-12860
URL: https://issues.apache.org/jira/browse/CASSANDRA-12860
Project: Cassandra
Issue Type: Bug
Environment: CentOS 7, JDK 8u60, Cassandra 2.2.2 (upgraded from 2.1.5)
Reporter: Bing Wu
Priority: Minor

Sometimes the repair fails:
{code}
ERROR [Repair#3:1] 2015-10-14 06:22:56,490 CassandraDaemon.java:185 - Exception in thread Thread[Repair#3:1,5,RMI Runtime]
com.google.common.util.concurrent.UncheckedExecutionException: org.apache.cassandra.exceptions.RepairException: [repair #018adc70-723c-11e5-b0d8-6b2151e4d388 on keyspace/table, (2414492737393085601,2788053941340954029]] Validation failed in /w.y.x.z
    at com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1387) ~[guava-16.0.jar:na]
    at com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1373) ~[guava-16.0.jar:na]
    at org.apache.cassandra.repair.RepairJob.run(RepairJob.java:169) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_60]
Caused by: org.apache.cassandra.exceptions.RepairException: [repair #018adc70-723c-11e5-b0d8-6b2151e4d388 on keyspace/table, (2414492737393085601,2788053941340954029]] Validation failed in /w.y.x.z
    at org.apache.cassandra.repair.ValidationTask.treeReceived(ValidationTask.java:64) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:183) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:399) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:163) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-2.2.2.jar:2.2.2]
    ... 3 common frames omitted
{code}
And here is the w.y.x.z side:
{code}
ERROR [ValidationExecutor:7] 2015-10-14 06:22:56,487 CompactionManager.java:1053 - Cannot start multiple repair sessions over the same sstables
ERROR [ValidationExecutor:7] 2015-10-14 06:22:56,487 Validator.java:246 - Failed creating a merkle tree for [repair #018adc70-723c-11e5-b0d8-6b2151e4d388 on keyspace/table, (2414492737393085601,2788053941340954029]], /a.b.c.d (see log for details)
ERROR [ValidationExecutor:7] 2015-10-14 06:22:56,488 CassandraDaemon.java:185 - Exception in thread Thread[ValidationExecutor:7,1,main]
java.lang.RuntimeException: Cannot start multiple repair sessions over the same sstables
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1054) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.compaction.CompactionManager.access$700(CompactionManager.java:86) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.compaction.CompactionManager$10.call(CompactionManager.java:652) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
...
ERROR [Reference-Reaper:1] 2015-10-14 06:23:21,439 Ref.java:187 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@74fc054a) to class org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@1949471967:/home/cassandra/dsc-cassandra-2.2.2/bin/../data/data/keyspace/table-b15521b062e4bbedcdee5e027297/la-1195-big was not released before the reference was garbage collected
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12539) Empty CommitLog prevents restart
[ https://issues.apache.org/jira/browse/CASSANDRA-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616594#comment-15616594 ]

Arvind Nithrakashyap commented on CASSANDRA-12539:
--

There seems to be a very easy way to reproduce this condition, so Cassandra should be able to handle this case without manual intervention at each occurrence, especially when it is deployed in an enterprise setting where services are expected to auto-heal.

> Empty CommitLog prevents restart
>
>
> Key: CASSANDRA-12539
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12539
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stefano Ortolani
>
> A node just crashed (known cause: CASSANDRA-11594), but to my surprise (unlike other times) restarting simply fails.
> Checking the logs showed:
> {noformat}
> ERROR [main] 2016-08-25 17:05:22,611 JVMStabilityInspector.java:82 - Exiting due to error while processing commit log during initialization.
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Could not read commit log descriptor in file /data/cassandra/commitlog/CommitLog-6-1468235564433.log
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:650) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:327) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:148) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:289) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557) [apache-cassandra-3.0.8.jar:3.0.8]
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) [apache-cassandra-3.0.8.jar:3.0.8]
> INFO [main] 2016-08-25 17:08:56,944 YamlConfigurationLoader.java:85 - Configuration location: file:/etc/cassandra/cassandra.yaml
> {noformat}
> Deleting the empty file fixes the problem.
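The workaround reported above (deleting the empty file) presupposes knowing which segment is empty. As a minimal sketch, not part of Cassandra and not a proposed fix, the following standalone Java helper lists zero-length files matching `CommitLog-*.log` in a directory. The class and method names are hypothetical, the default directory is simply the path from the log excerpt, and the helper only reports files; it deletes nothing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; it is not a Cassandra class.
public class EmptyCommitLogFinder
{
    /** Returns the zero-length CommitLog-*.log files in dir, sorted by path. Nothing is deleted. */
    public static List<Path> findEmptySegments(Path dir) throws IOException
    {
        List<Path> empty = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "CommitLog-*.log"))
        {
            for (Path p : stream)
            {
                if (Files.isRegularFile(p) && Files.size(p) == 0)
                    empty.add(p);
            }
        }
        Collections.sort(empty);
        return empty;
    }

    public static void main(String[] args) throws IOException
    {
        // Default taken from the log excerpt in the report; pass another directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/data/cassandra/commitlog");
        for (Path p : findEmptySegments(dir))
            System.out.println("empty segment: " + p);
    }
}
```

Whether an empty segment is safe to remove is a judgment for the operator; the sketch only makes the candidates visible before the node is restarted.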
[jira] [Commented] (CASSANDRA-12689) All MutationStage threads blocked, kills server
[ https://issues.apache.org/jira/browse/CASSANDRA-12689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616518#comment-15616518 ]

Tyler Hobbs commented on CASSANDRA-12689:
-

The new test results look good, so +1, committed to 3.0 as {{d38a732ce15caab57ce6dddb3c0d6a436506db29}} and merged up to 3.X and trunk. Thanks!

> All MutationStage threads blocked, kills server
> ---
>
> Key: CASSANDRA-12689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12689
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Reporter: Benjamin Roth
> Assignee: Benjamin Roth
> Priority: Critical
> Fix For: 3.0.10, 3.10
>
>
> Under heavy load (e.g. due to repair during normal operations), a lot of NullPointerExceptions occur in MutationStage. Unfortunately, the log is not very chatty and the trace is missing:
> {noformat}
> 2016-09-22T06:29:47+00:00 cas6 [MutationStage-1] org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService Uncaught exception on thread Thread[MutationStage-1,5,main]: {}
> 2016-09-22T06:29:47+00:00 cas6 #011java.lang.NullPointerException: null
> {noformat}
> Then, after some time, in most cases ALL threads in MutationStage pools are completely blocked. This leads to pending tasks piling up until the server runs OOM and is completely unresponsive due to GC. Threads will NEVER unblock until server restart, even if load goes completely down, all hints are paused, and no compaction or repair is running. Only a restart helps.
> I can understand that pending tasks in MutationStage may pile up under heavy load, but tasks should be processed and dequeued after load goes down. This is definitely not the case. This looks more like an unhandled exception leading to a stuck lock.
> Stack trace from jconsole, all threads in MutationStage show the same trace.
> {noformat}
> Name: MutationStage-48
> State: WAITING on java.util.concurrent.CompletableFuture$Signaller@fcc8266
> Total blocked: 137 Total waited: 138.513
> {noformat}
> Stack trace:
> {noformat}
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
> java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:137)
> org.apache.cassandra.db.Mutation.apply(Mutation.java:227)
> org.apache.cassandra.db.Mutation.apply(Mutation.java:241)
> org.apache.cassandra.hints.Hint.apply(Hint.java:96)
> org.apache.cassandra.hints.HintVerbHandler.doVerb(HintVerbHandler.java:91)
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109)
> java.lang.Thread.run(Thread.java:745)
> {noformat}
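The trace above shows every stage worker parked in `CompletableFuture.get`. One general way such a picture can arise is pool starvation: each worker of a bounded pool blocks on work that can only run on that same pool. The sketch below reproduces that generic pattern with plain JDK executors. It is an illustration under that assumption, not Cassandra's code path, and the class and method names are hypothetical. A timeout is used so the demo terminates, whereas the reported threads wait indefinitely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; it only uses java.util.concurrent.
public class StageStarvationDemo
{
    /**
     * Occupies every thread of a fixed pool with a task that submits follow-up work to the
     * same pool and then blocks on it. Returns how many of those tasks timed out waiting.
     */
    public static int blockedWorkers(int poolSize, long timeoutMillis)
    {
        ExecutorService stage = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allStarted = new CountDownLatch(poolSize);
        CountDownLatch allWaited = new CountDownLatch(poolSize);
        try
        {
            Callable<Boolean> task = () -> {
                // Wait until every worker thread is occupied by one of these tasks.
                allStarted.countDown();
                allStarted.await();
                // The follow-up can only run on a pool thread, and none is free.
                CompletableFuture<Void> followUp = CompletableFuture.runAsync(() -> {}, stage);
                boolean timedOut;
                try
                {
                    followUp.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    timedOut = false;
                }
                catch (TimeoutException e)
                {
                    timedOut = true;
                }
                // Keep this worker busy until all tasks have finished waiting, so the count is stable.
                allWaited.countDown();
                allWaited.await();
                return timedOut;
            };

            List<Future<Boolean>> outer = new ArrayList<>();
            for (int i = 0; i < poolSize; i++)
                outer.add(stage.submit(task));

            int blocked = 0;
            for (Future<Boolean> f : outer)
            {
                if (f.get())
                    blocked++;
            }
            return blocked;
        }
        catch (InterruptedException | ExecutionException e)
        {
            throw new IllegalStateException(e);
        }
        finally
        {
            stage.shutdownNow();
        }
    }

    public static void main(String[] args)
    {
        System.out.println("blocked workers: " + blockedWorkers(4, 200) + " of 4");
    }
}
```

With four threads, all four tasks time out, because no follow-up can start while every worker is waiting. Avoiding the blocking wait on the stage's own threads, or making sure the awaited work never needs a thread from the same stage, removes the cycle.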
[jira] [Resolved] (CASSANDRA-12689) All MutationStage threads blocked, kills server
[ https://issues.apache.org/jira/browse/CASSANDRA-12689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs resolved CASSANDRA-12689.
-
Resolution: Fixed
Fix Version/s: (was: 3.0.x)
               (was: 3.x)
               3.10
               3.0.10

> All MutationStage threads blocked, kills server
> ---
>
> Key: CASSANDRA-12689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12689
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Reporter: Benjamin Roth
> Assignee: Benjamin Roth
> Priority: Critical
> Fix For: 3.0.10, 3.10
[3/6] cassandra git commit: Avoid deadlock due to MV lock contention
Avoid deadlock due to MV lock contention Patch by Benjamin Roth; reviewed by Tyler Hobbs for CASSANDRA-12689 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d38a732c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d38a732c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d38a732c Branch: refs/heads/trunk Commit: d38a732ce15caab57ce6dddb3c0d6a436506db29 Parents: e4f840a Author: brstgt Authored: Fri Oct 28 15:39:03 2016 -0500 Committer: Tyler Hobbs Committed: Fri Oct 28 15:39:03 2016 -0500 -- CHANGES.txt | 1 + .../cassandra/config/DatabaseDescriptor.java| 2 +- src/java/org/apache/cassandra/db/Keyspace.java | 107 ++- src/java/org/apache/cassandra/db/Mutation.java | 12 +-- .../cassandra/service/paxos/PaxosState.java | 2 +- 5 files changed, 87 insertions(+), 37 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/d38a732c/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index bf1e7d6..c80e045 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0.10 + * Avoid deadlock due to materialized view lock contention (CASSANDRA-12689) * Fix for KeyCacheCqlTest flakiness (CASSANDRA-12801) * Include SSTable filename in compacting large row message (CASSANDRA-12384) * Fix potential socket leak (CASSANDRA-12329, CASSANDRA-12330) http://git-wip-us.apache.org/repos/asf/cassandra/blob/d38a732c/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index baea210..7b32a34 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -392,7 +392,7 @@ public class DatabaseDescriptor throw new ConfigurationException("concurrent_reads must be at least 2, but was " + conf.concurrent_reads, false); } -if (conf.concurrent_writes != null && 
conf.concurrent_writes < 2) +if (conf.concurrent_writes != null && conf.concurrent_writes < 2 && System.getProperty("cassandra.test.fail_mv_locks_count", "").isEmpty()) { throw new ConfigurationException("concurrent_writes must be at least 2, but was " + conf.concurrent_writes, false); } http://git-wip-us.apache.org/repos/asf/cassandra/blob/d38a732c/src/java/org/apache/cassandra/db/Keyspace.java -- diff --git a/src/java/org/apache/cassandra/db/Keyspace.java b/src/java/org/apache/cassandra/db/Keyspace.java index 8d710d1..75aab8f 100644 --- a/src/java/org/apache/cassandra/db/Keyspace.java +++ b/src/java/org/apache/cassandra/db/Keyspace.java @@ -63,6 +63,7 @@ public class Keyspace private static final String TEST_FAIL_WRITES_KS = System.getProperty("cassandra.test.fail_writes_ks", ""); private static final boolean TEST_FAIL_WRITES = !TEST_FAIL_WRITES_KS.isEmpty(); +private static int TEST_FAIL_MV_LOCKS_COUNT = Integer.getInteger(System.getProperty("cassandra.test.fail_mv_locks_count", "0"), 0); public final KeyspaceMetrics metric; @@ -384,6 +385,20 @@ public class Keyspace return apply(mutation, writeCommitLog, true, false, null); } +/** + * Should be used if caller is blocking and runs in mutation stage. + * Otherwise there is a race condition where ALL mutation workers are beeing blocked ending + * in a complete deadlock of the mutation stage. See CASSANDRA-12689. 
+ * + * @param mutation + * @param writeCommitLog + * @return + */ +public CompletableFuture applyNotDeferrable(Mutation mutation, boolean writeCommitLog) +{ +return apply(mutation, writeCommitLog, true, false, false, null); +} + public CompletableFuture apply(Mutation mutation, boolean writeCommitLog, boolean updateIndexes) { return apply(mutation, writeCommitLog, updateIndexes, false, null); @@ -394,6 +409,15 @@ public class Keyspace return apply(mutation, false, true, true, null); } +public CompletableFuture apply(final Mutation mutation, + final boolean writeCommitLog, + boolean updateIndexes, + boolean isClReplay, + CompletableFuture future) +{ +return apply(mutation, writeCommitLog, updateIndexes, isClReplay, true, future); +} + /** * This method appends a row to the glo
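A side note on the `TEST_FAIL_MV_LOCKS_COUNT` line in the diff above: `Integer.getInteger(String, int)` treats its first argument as the *name* of a system property, so passing it the result of `System.getProperty(...)` looks up a property named after the other property's value. Whether that was intended upstream is not something this message establishes. The sketch below only demonstrates the JDK behaviour; the property name `demo.fail_locks_count` and the class are hypothetical.

```java
// Hypothetical demo class; only java.lang APIs are used.
public class GetIntegerDemo
{
    // Hypothetical property name used only for this demonstration.
    public static final String PROP = "demo.fail_locks_count";

    /** Nested form: the property's value is passed on as a property name. */
    public static int nestedLookup()
    {
        return Integer.getInteger(System.getProperty(PROP, "0"), 0);
    }

    /** Direct form: the property is looked up by name and its value is parsed. */
    public static int directLookup()
    {
        return Integer.getInteger(PROP, 0);
    }

    public static void main(String[] args)
    {
        System.setProperty(PROP, "100");
        // Assuming no system property is literally named "100":
        System.out.println("nested: " + nestedLookup()); // prints "nested: 0"
        System.out.println("direct: " + directLookup()); // prints "direct: 100"
    }
}
```

In the nested form the configured value is effectively ignored unless a property named like that value happens to exist, so the default is returned.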
[2/6] cassandra git commit: Avoid deadlock due to MV lock contention
Avoid deadlock due to MV lock contention Patch by Benjamin Roth; reviewed by Tyler Hobbs for CASSANDRA-12689 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d38a732c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d38a732c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d38a732c Branch: refs/heads/cassandra-3.X Commit: d38a732ce15caab57ce6dddb3c0d6a436506db29 Parents: e4f840a Author: brstgt Authored: Fri Oct 28 15:39:03 2016 -0500 Committer: Tyler Hobbs Committed: Fri Oct 28 15:39:03 2016 -0500 -- CHANGES.txt | 1 + .../cassandra/config/DatabaseDescriptor.java| 2 +- src/java/org/apache/cassandra/db/Keyspace.java | 107 ++- src/java/org/apache/cassandra/db/Mutation.java | 12 +-- .../cassandra/service/paxos/PaxosState.java | 2 +- 5 files changed, 87 insertions(+), 37 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/d38a732c/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index bf1e7d6..c80e045 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0.10 + * Avoid deadlock due to materialized view lock contention (CASSANDRA-12689) * Fix for KeyCacheCqlTest flakiness (CASSANDRA-12801) * Include SSTable filename in compacting large row message (CASSANDRA-12384) * Fix potential socket leak (CASSANDRA-12329, CASSANDRA-12330) http://git-wip-us.apache.org/repos/asf/cassandra/blob/d38a732c/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index baea210..7b32a34 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -392,7 +392,7 @@ public class DatabaseDescriptor throw new ConfigurationException("concurrent_reads must be at least 2, but was " + conf.concurrent_reads, false); } -if (conf.concurrent_writes != 
null && conf.concurrent_writes < 2) +if (conf.concurrent_writes != null && conf.concurrent_writes < 2 && System.getProperty("cassandra.test.fail_mv_locks_count", "").isEmpty()) { throw new ConfigurationException("concurrent_writes must be at least 2, but was " + conf.concurrent_writes, false); } http://git-wip-us.apache.org/repos/asf/cassandra/blob/d38a732c/src/java/org/apache/cassandra/db/Keyspace.java -- diff --git a/src/java/org/apache/cassandra/db/Keyspace.java b/src/java/org/apache/cassandra/db/Keyspace.java index 8d710d1..75aab8f 100644 --- a/src/java/org/apache/cassandra/db/Keyspace.java +++ b/src/java/org/apache/cassandra/db/Keyspace.java @@ -63,6 +63,7 @@ public class Keyspace private static final String TEST_FAIL_WRITES_KS = System.getProperty("cassandra.test.fail_writes_ks", ""); private static final boolean TEST_FAIL_WRITES = !TEST_FAIL_WRITES_KS.isEmpty(); +private static int TEST_FAIL_MV_LOCKS_COUNT = Integer.getInteger(System.getProperty("cassandra.test.fail_mv_locks_count", "0"), 0); public final KeyspaceMetrics metric; @@ -384,6 +385,20 @@ public class Keyspace return apply(mutation, writeCommitLog, true, false, null); } +/** + * Should be used if caller is blocking and runs in mutation stage. + * Otherwise there is a race condition where ALL mutation workers are beeing blocked ending + * in a complete deadlock of the mutation stage. See CASSANDRA-12689. 
+ * + * @param mutation + * @param writeCommitLog + * @return + */ +public CompletableFuture applyNotDeferrable(Mutation mutation, boolean writeCommitLog) +{ +return apply(mutation, writeCommitLog, true, false, false, null); +} + public CompletableFuture apply(Mutation mutation, boolean writeCommitLog, boolean updateIndexes) { return apply(mutation, writeCommitLog, updateIndexes, false, null); @@ -394,6 +409,15 @@ public class Keyspace return apply(mutation, false, true, true, null); } +public CompletableFuture apply(final Mutation mutation, + final boolean writeCommitLog, + boolean updateIndexes, + boolean isClReplay, + CompletableFuture future) +{ +return apply(mutation, writeCommitLog, updateIndexes, isClReplay, true, future); +} + /** * This method appends a row to
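The diff above adds `applyNotDeferrable` and a five-argument `apply` overload that delegates with an extra `true`, but the message is cut off before the six-argument method, so how the flag is consumed is not visible here. The sketch below is an assumption about what a deferrable flag could mean in this setting: when a lock is contended, a deferrable caller hands the work back to the executor and returns, while a non-deferrable caller stays on its thread and retries. All names are hypothetical and this is not the Cassandra implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch; it is not the code from Keyspace.java.
public class DeferrableApplySketch
{
    /** Default entry point: contention is handled by deferring the work to the stage. */
    public static CompletableFuture<Void> apply(Lock lock, Runnable work, Executor stage)
    {
        return apply(lock, work, stage, true, null);
    }

    /** For callers that block on the result while running on a stage thread themselves. */
    public static CompletableFuture<Void> applyNotDeferrable(Lock lock, Runnable work)
    {
        return apply(lock, work, null, false, null);
    }

    static CompletableFuture<Void> apply(Lock lock, Runnable work, Executor stage,
                                         boolean isDeferrable, CompletableFuture<Void> future)
    {
        final CompletableFuture<Void> result = future != null ? future : new CompletableFuture<Void>();
        while (!lock.tryLock())
        {
            if (isDeferrable)
            {
                // Hand the attempt back to the stage; the same future is completed by a later retry.
                stage.execute(() -> apply(lock, work, stage, true, result));
                return result;
            }
            // Non-deferrable: pause briefly on the current thread, then try the lock again.
            LockSupport.parkNanos(1_000_000L);
        }
        try
        {
            work.run();
            result.complete(null);
        }
        catch (RuntimeException e)
        {
            result.completeExceptionally(e);
        }
        finally
        {
            lock.unlock();
        }
        return result;
    }
}
```

The design point of the sketch: if a stage thread calls the deferrable variant and then blocks on the returned future, the retry needs another free stage thread, which is the starvation hazard the Javadoc in the diff warns about. The non-deferrable variant never depends on a second thread, at the cost of occupying the caller while the lock is contended.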
[6/6] cassandra git commit: Merge branch 'cassandra-3.X' into trunk
Merge branch 'cassandra-3.X' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6f1ce682 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6f1ce682 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6f1ce682 Branch: refs/heads/trunk Commit: 6f1ce6823088e9eeb28f6928bed44050024a1f25 Parents: 9e8b7c0 0a1f1c8 Author: Tyler Hobbs Authored: Fri Oct 28 15:41:12 2016 -0500 Committer: Tyler Hobbs Committed: Fri Oct 28 15:41:12 2016 -0500 -- CHANGES.txt | 1 + .../cassandra/config/DatabaseDescriptor.java| 2 +- src/java/org/apache/cassandra/db/Keyspace.java | 110 ++- src/java/org/apache/cassandra/db/Mutation.java | 12 +- .../cassandra/service/paxos/PaxosState.java | 2 +- 5 files changed, 92 insertions(+), 35 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/6f1ce682/CHANGES.txt -- diff --cc CHANGES.txt index a6410d9,82d3d9c..264f8d5 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -102,8 -94,9 +102,9 @@@ * Remove pre-startup check for open JMX port (CASSANDRA-12074) * Remove compaction Severity from DynamicEndpointSnitch (CASSANDRA-11738) * Restore resumable hints delivery (CASSANDRA-11960) - * Properly report LWT contention (CASSANDRA-12626) + * Properly record CAS contention (CASSANDRA-12626) Merged from 3.0: + * Avoid deadlock due to MV lock contention (CASSANDRA-12689) * Fix for KeyCacheCqlTest flakiness (CASSANDRA-12801) * Include SSTable filename in compacting large row message (CASSANDRA-12384) * Fix potential socket leak (CASSANDRA-12329, CASSANDRA-12330)
[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.X
Merge branch 'cassandra-3.0' into cassandra-3.X Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0a1f1c81 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0a1f1c81 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0a1f1c81 Branch: refs/heads/cassandra-3.X Commit: 0a1f1c81e641039ca9fd573d5217b6b6f2ad8fb8 Parents: 9be467a d38a732 Author: Tyler Hobbs Authored: Fri Oct 28 15:41:02 2016 -0500 Committer: Tyler Hobbs Committed: Fri Oct 28 15:41:02 2016 -0500 -- CHANGES.txt | 1 + .../cassandra/config/DatabaseDescriptor.java| 2 +- src/java/org/apache/cassandra/db/Keyspace.java | 110 ++- src/java/org/apache/cassandra/db/Mutation.java | 12 +- .../cassandra/service/paxos/PaxosState.java | 2 +- 5 files changed, 92 insertions(+), 35 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a1f1c81/CHANGES.txt -- diff --cc CHANGES.txt index bbd6f00,c80e045..82d3d9c --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,101 -1,5 +1,102 @@@ -3.0.10 - * Avoid deadlock due to materialized view lock contention (CASSANDRA-12689) +3.10 + * Set JOINING mode when running pre-join tasks (CASSANDRA-12836) + * remove net.mintern.primitive library due to license issue (CASSANDRA-12845) + * Properly format IPv6 addresses when logging JMX service URL (CASSANDRA-12454) + * Optimize the vnode allocation for single replica per DC (CASSANDRA-12777) + * Use non-token restrictions for bounds when token restrictions are overridden (CASSANDRA-12419) + * Fix CQLSH auto completion for PER PARTITION LIMIT (CASSANDRA-12803) + * Use different build directories for Eclipse and Ant (CASSANDRA-12466) + * Avoid potential AttributeError in cqlsh due to no table metadata (CASSANDRA-12815) + * Fix RandomReplicationAwareTokenAllocatorTest.testExistingCluster (CASSANDRA-12812) + * Upgrade commons-codec to 1.9 (CASSANDRA-12790) + * Make the fanout size for LeveledCompactionStrategy to be configurable (CASSANDRA-11550) 
+ * Add duration data type (CASSANDRA-11873) + * Fix timeout in ReplicationAwareTokenAllocatorTest (CASSANDRA-12784) + * Improve sum aggregate functions (CASSANDRA-12417) + * Make cassandra.yaml docs for batch_size_*_threshold_in_kb reflect changes in CASSANDRA-10876 (CASSANDRA-12761) + * cqlsh fails to format collections when using aliases (CASSANDRA-11534) + * Check for hash conflicts in prepared statements (CASSANDRA-12733) + * Exit query parsing upon first error (CASSANDRA-12598) + * Fix cassandra-stress to use single seed in UUID generation (CASSANDRA-12729) + * CQLSSTableWriter does not allow Update statement (CASSANDRA-12450) + * Config class uses boxed types but DD exposes primitive types (CASSANDRA-12199) + * Add pre- and post-shutdown hooks to Storage Service (CASSANDRA-12461) + * Add hint delivery metrics (CASSANDRA-12693) + * Remove IndexInfo cache from FileIndexInfoRetriever (CASSANDRA-12731) + * ColumnIndex does not reuse buffer (CASSANDRA-12502) + * cdc column addition still breaks schema migration tasks (CASSANDRA-12697) + * Upgrade metrics-reporter dependencies (CASSANDRA-12089) + * Tune compaction thread count via nodetool (CASSANDRA-12248) + * Add +=/-= shortcut syntax for update queries (CASSANDRA-12232) + * Include repair session IDs in repair start message (CASSANDRA-12532) + * Add a blocking task to Index, run before joining the ring (CASSANDRA-12039) + * Fix NPE when using CQLSSTableWriter (CASSANDRA-12667) + * Support optional backpressure strategies at the coordinator (CASSANDRA-9318) + * Make randompartitioner work with new vnode allocation (CASSANDRA-12647) + * Fix cassandra-stress graphing (CASSANDRA-12237) + * Allow filtering on partition key columns for queries without secondary indexes (CASSANDRA-11031) + * Fix Cassandra Stress reporting thread model and precision (CASSANDRA-12585) + * Add JMH benchmarks.jar (CASSANDRA-12586) + * Add row offset support to SASI (CASSANDRA-11990) + * Cleanup uses of AlterTableStatementColumn 
(CASSANDRA-12567) + * Add keep-alive to streaming (CASSANDRA-11841) + * Tracing payload is passed through newSession(..) (CASSANDRA-11706) + * avoid deleting non existing sstable files and improve related log messages (CASSANDRA-12261) + * json/yaml output format for nodetool compactionhistory (CASSANDRA-12486) + * Retry all internode messages once after a connection is + closed and reopened (CASSANDRA-12192) + * Add support to rebuild from targeted replica (CASSANDRA-9875) + * Add sequence distribution type to cassandra stress (CASSANDRA-12490) + * "SELECT * FROM foo LIMIT ;" does not error out (CASSANDRA-12154) + * Define executeLocally() at the Re
[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.X
Merge branch 'cassandra-3.0' into cassandra-3.X Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0a1f1c81 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0a1f1c81 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0a1f1c81 Branch: refs/heads/trunk Commit: 0a1f1c81e641039ca9fd573d5217b6b6f2ad8fb8 Parents: 9be467a d38a732 Author: Tyler Hobbs Authored: Fri Oct 28 15:41:02 2016 -0500 Committer: Tyler Hobbs Committed: Fri Oct 28 15:41:02 2016 -0500 -- CHANGES.txt | 1 + .../cassandra/config/DatabaseDescriptor.java| 2 +- src/java/org/apache/cassandra/db/Keyspace.java | 110 ++- src/java/org/apache/cassandra/db/Mutation.java | 12 +- .../cassandra/service/paxos/PaxosState.java | 2 +- 5 files changed, 92 insertions(+), 35 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a1f1c81/CHANGES.txt -- diff --cc CHANGES.txt index bbd6f00,c80e045..82d3d9c --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,101 -1,5 +1,102 @@@ -3.0.10 - * Avoid deadlock due to materialized view lock contention (CASSANDRA-12689) +3.10 + * Set JOINING mode when running pre-join tasks (CASSANDRA-12836) + * remove net.mintern.primitive library due to license issue (CASSANDRA-12845) + * Properly format IPv6 addresses when logging JMX service URL (CASSANDRA-12454) + * Optimize the vnode allocation for single replica per DC (CASSANDRA-12777) + * Use non-token restrictions for bounds when token restrictions are overridden (CASSANDRA-12419) + * Fix CQLSH auto completion for PER PARTITION LIMIT (CASSANDRA-12803) + * Use different build directories for Eclipse and Ant (CASSANDRA-12466) + * Avoid potential AttributeError in cqlsh due to no table metadata (CASSANDRA-12815) + * Fix RandomReplicationAwareTokenAllocatorTest.testExistingCluster (CASSANDRA-12812) + * Upgrade commons-codec to 1.9 (CASSANDRA-12790) + * Make the fanout size for LeveledCompactionStrategy to be configurable (CASSANDRA-11550) + * Add 
duration data type (CASSANDRA-11873) + * Fix timeout in ReplicationAwareTokenAllocatorTest (CASSANDRA-12784) + * Improve sum aggregate functions (CASSANDRA-12417) + * Make cassandra.yaml docs for batch_size_*_threshold_in_kb reflect changes in CASSANDRA-10876 (CASSANDRA-12761) + * cqlsh fails to format collections when using aliases (CASSANDRA-11534) + * Check for hash conflicts in prepared statements (CASSANDRA-12733) + * Exit query parsing upon first error (CASSANDRA-12598) + * Fix cassandra-stress to use single seed in UUID generation (CASSANDRA-12729) + * CQLSSTableWriter does not allow Update statement (CASSANDRA-12450) + * Config class uses boxed types but DD exposes primitive types (CASSANDRA-12199) + * Add pre- and post-shutdown hooks to Storage Service (CASSANDRA-12461) + * Add hint delivery metrics (CASSANDRA-12693) + * Remove IndexInfo cache from FileIndexInfoRetriever (CASSANDRA-12731) + * ColumnIndex does not reuse buffer (CASSANDRA-12502) + * cdc column addition still breaks schema migration tasks (CASSANDRA-12697) + * Upgrade metrics-reporter dependencies (CASSANDRA-12089) + * Tune compaction thread count via nodetool (CASSANDRA-12248) + * Add +=/-= shortcut syntax for update queries (CASSANDRA-12232) + * Include repair session IDs in repair start message (CASSANDRA-12532) + * Add a blocking task to Index, run before joining the ring (CASSANDRA-12039) + * Fix NPE when using CQLSSTableWriter (CASSANDRA-12667) + * Support optional backpressure strategies at the coordinator (CASSANDRA-9318) + * Make randompartitioner work with new vnode allocation (CASSANDRA-12647) + * Fix cassandra-stress graphing (CASSANDRA-12237) + * Allow filtering on partition key columns for queries without secondary indexes (CASSANDRA-11031) + * Fix Cassandra Stress reporting thread model and precision (CASSANDRA-12585) + * Add JMH benchmarks.jar (CASSANDRA-12586) + * Add row offset support to SASI (CASSANDRA-11990) + * Cleanup uses of AlterTableStatementColumn (CASSANDRA-12567) 
+ * Add keep-alive to streaming (CASSANDRA-11841) + * Tracing payload is passed through newSession(..) (CASSANDRA-11706) + * avoid deleting non existing sstable files and improve related log messages (CASSANDRA-12261) + * json/yaml output format for nodetool compactionhistory (CASSANDRA-12486) + * Retry all internode messages once after a connection is + closed and reopened (CASSANDRA-12192) + * Add support to rebuild from targeted replica (CASSANDRA-9875) + * Add sequence distribution type to cassandra stress (CASSANDRA-12490) + * "SELECT * FROM foo LIMIT ;" does not error out (CASSANDRA-12154) + * Define executeLocally() at the ReadQuery
[1/6] cassandra git commit: Avoid deadlock due to MV lock contention
Repository: cassandra Updated Branches: refs/heads/cassandra-3.0 e4f840aa1 -> d38a732ce refs/heads/cassandra-3.X 9be467a22 -> 0a1f1c81e refs/heads/trunk 9e8b7c0d0 -> 6f1ce6823 Avoid deadlock due to MV lock contention Patch by Benjamin Roth; reviewed by Tyler Hobbs for CASSANDRA-12689 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d38a732c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d38a732c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d38a732c Branch: refs/heads/cassandra-3.0 Commit: d38a732ce15caab57ce6dddb3c0d6a436506db29 Parents: e4f840a Author: brstgt Authored: Fri Oct 28 15:39:03 2016 -0500 Committer: Tyler Hobbs Committed: Fri Oct 28 15:39:03 2016 -0500 -- CHANGES.txt | 1 + .../cassandra/config/DatabaseDescriptor.java| 2 +- src/java/org/apache/cassandra/db/Keyspace.java | 107 ++- src/java/org/apache/cassandra/db/Mutation.java | 12 +-- .../cassandra/service/paxos/PaxosState.java | 2 +- 5 files changed, 87 insertions(+), 37 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/d38a732c/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index bf1e7d6..c80e045 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0.10 + * Avoid deadlock due to materialized view lock contention (CASSANDRA-12689) * Fix for KeyCacheCqlTest flakiness (CASSANDRA-12801) * Include SSTable filename in compacting large row message (CASSANDRA-12384) * Fix potential socket leak (CASSANDRA-12329, CASSANDRA-12330) http://git-wip-us.apache.org/repos/asf/cassandra/blob/d38a732c/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index baea210..7b32a34 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -392,7 +392,7 @@ 
public class DatabaseDescriptor throw new ConfigurationException("concurrent_reads must be at least 2, but was " + conf.concurrent_reads, false); } -if (conf.concurrent_writes != null && conf.concurrent_writes < 2) +if (conf.concurrent_writes != null && conf.concurrent_writes < 2 && System.getProperty("cassandra.test.fail_mv_locks_count", "").isEmpty()) { throw new ConfigurationException("concurrent_writes must be at least 2, but was " + conf.concurrent_writes, false); } http://git-wip-us.apache.org/repos/asf/cassandra/blob/d38a732c/src/java/org/apache/cassandra/db/Keyspace.java -- diff --git a/src/java/org/apache/cassandra/db/Keyspace.java b/src/java/org/apache/cassandra/db/Keyspace.java index 8d710d1..75aab8f 100644 --- a/src/java/org/apache/cassandra/db/Keyspace.java +++ b/src/java/org/apache/cassandra/db/Keyspace.java @@ -63,6 +63,7 @@ public class Keyspace private static final String TEST_FAIL_WRITES_KS = System.getProperty("cassandra.test.fail_writes_ks", ""); private static final boolean TEST_FAIL_WRITES = !TEST_FAIL_WRITES_KS.isEmpty(); +private static int TEST_FAIL_MV_LOCKS_COUNT = Integer.getInteger(System.getProperty("cassandra.test.fail_mv_locks_count", "0"), 0); public final KeyspaceMetrics metric; @@ -384,6 +385,20 @@ public class Keyspace return apply(mutation, writeCommitLog, true, false, null); } +/** + * Should be used if caller is blocking and runs in mutation stage. + * Otherwise there is a race condition where ALL mutation workers are beeing blocked ending + * in a complete deadlock of the mutation stage. See CASSANDRA-12689. 
+ * + * @param mutation + * @param writeCommitLog + * @return + */ +public CompletableFuture applyNotDeferrable(Mutation mutation, boolean writeCommitLog) +{ +return apply(mutation, writeCommitLog, true, false, false, null); +} + public CompletableFuture apply(Mutation mutation, boolean writeCommitLog, boolean updateIndexes) { return apply(mutation, writeCommitLog, updateIndexes, false, null); @@ -394,6 +409,15 @@ public class Keyspace return apply(mutation, false, true, true, null); } +public CompletableFuture apply(final Mutation mutation, + final boolean writeCommitLog, + boolean updateIndexes, + boolean isClReplay, +
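A note on the `TEST_FAIL_MV_LOCKS_COUNT` initializer in the Keyspace.java hunk above: `Integer.getInteger(String, int)` treats its first argument as the *name* of a system property, not as a number to parse. As the diff is written, the value of `cassandra.test.fail_mv_locks_count` is itself used as a property name, so the lookup would normally fall back to the default of 0. The sketch below only demonstrates that JDK behaviour; the class and method names are hypothetical and are not part of the patch.

```java
public class GetIntegerSketch {
    // Property name taken from the diff; everything else here is illustrative.
    static final String PROP = "cassandra.test.fail_mv_locks_count";

    // Mirrors the expression in the diff: the property VALUE is passed to
    // Integer.getInteger, which then looks up a property with that name.
    static int asWrittenInDiff() {
        return Integer.getInteger(System.getProperty(PROP, "0"), 0);
    }

    // Passes the property NAME, which is what Integer.getInteger expects.
    static int byPropertyName() {
        return Integer.getInteger(PROP, 0);
    }

    // Alternative: read the value explicitly and parse it.
    static int byParsing() {
        return Integer.parseInt(System.getProperty(PROP, "0"));
    }

    public static void main(String[] args) {
        System.setProperty(PROP, "3");
        System.out.println("as written in diff: " + asWrittenInDiff());
        System.out.println("by property name:   " + byPropertyName());
        System.out.println("by parsing:         " + byParsing());
    }
}
```

Either of the last two forms reads the configured count; which one the project eventually adopted is not visible in this message.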
[jira] [Commented] (CASSANDRA-12859) Column-level permissions
[ https://issues.apache.org/jira/browse/CASSANDRA-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616206#comment-15616206 ] Jeff Jirsa commented on CASSANDRA-12859: Note also: CASSANDRA-8303 > Column-level permissions > > > Key: CASSANDRA-12859 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12859 > Project: Cassandra > Issue Type: New Feature > Components: Core, CQL >Reporter: Boris Melamed > Attachments: Cassandra Proposal - Column-level permissions.docx > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. Here is a draft of: > Cassandra Proposal - Column-level permissions.docx (attached) > h4. Quoting the 'Overview' section: > The purpose of this proposal is to add column-level (field-level) permissions > to Cassandra. It is my intent to soon start implementing this feature in a > fork, and to submit a pull request once it’s ready. > h4. Motivation > Cassandra already supports permissions on keyspace and table (column family) > level. Sources: > * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra > * https://cassandra.apache.org/doc/latest/cql/security.html#data-control > At IBM, we have use cases in the area of big data analytics where > column-level access permissions are also a requirement. All industry RDBMS > products are supporting this level of permission control, and regulators are > expecting it from all data-based systems. > h4. Main day-one requirements > # Extend CQL (Cassandra Query Language) to be able to optionally specify a > list of individual columns, in the {{GRANT}} statement. The relevant > permission types are: {{MODIFY}} (for {{UPDATE}} and {{INSERT}}) and > {{SELECT}}. > # Persist the optional information in the appropriate system table > ‘system_auth.role_permissions’. > # Enforce the column access restrictions during execution. Details: > #* Should fit with the existing permission propagation down a role chain. 
> #* Proposed message format when a user’s roles give access to the queried > table but not to all of the selected, inserted, or updated columns: > "User %s has no %s permission on column %s of table %s" > #* Error will report only the first checked column. > Nice to have: list all inaccessible columns. > #* Error code is the same as for table access denial: 2100. > h4. Additional day-one requirements > # Reflect the column-level permissions in statements of type > {{LIST ALL PERMISSIONS OF someuser;}} > # Performance should not degrade in any significant way. > # Backwards compatibility > #* Permission enforcement for DBs created before the upgrade should continue > to work with the same behavior after upgrading to a version that allows > column-level permissions. > #* Previous CQL syntax will remain valid, and have the same effect as before. > h4. Documentation > * > https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission > * Feedback request: any others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
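The enforcement rule in requirement 3 of the proposal quoted above (report only the first inaccessible column, using the proposed message format) can be sketched as follows. This is a minimal illustration under stated assumptions, not Cassandra code: `ColumnPermissionSketch`, `firstDenial`, and the idea of passing the granted columns as a plain `Set` are all hypothetical. The constant reflects the error code the proposal cites, which the native protocol writes as `0x2100` (Unauthorized).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class ColumnPermissionSketch {
    // Message format quoted from the proposal.
    static final String DENIED_FORMAT =
        "User %s has no %s permission on column %s of table %s";

    // Same error code as table access denial, per the proposal (hypothetical constant name).
    static final int UNAUTHORIZED = 0x2100;

    // Returns the denial message for the first requested column that is not
    // granted, or an empty Optional if every requested column is accessible.
    static Optional<String> firstDenial(String user, String permission, String table,
                                        Set<String> grantedColumns,
                                        List<String> requestedColumns) {
        for (String column : requestedColumns) {
            if (!grantedColumns.contains(column)) {
                return Optional.of(String.format(DENIED_FORMAT, user, permission, column, table));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> granted = new HashSet<>(Arrays.asList("id", "name"));
        List<String> requested = Arrays.asList("id", "email", "ssn");
        firstDenial("alice", "SELECT", "myks.users", granted, requested)
            .ifPresent(System.out::println);
    }
}
```

The "nice to have" variant from the proposal would collect every missing column instead of returning at the first one.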
[jira] [Commented] (CASSANDRA-12838) Extend native protocol flags and add supported versions to the SUPPORTED response
[ https://issues.apache.org/jira/browse/CASSANDRA-12838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616194#comment-15616194 ] Adam Holmberg commented on CASSANDRA-12838: --- bq. can we arrange a time slot next week, morning your time, where we can commit this ticket and simultaneously merge the pull request into cassandra-test Yes, I have merged and tested on `cassandra-test` branch, will wait for your signal. Please let me know what day/time you would like to coordinate. > Extend native protocol flags and add supported versions to the SUPPORTED > response > - > > Key: CASSANDRA-12838 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12838 > Project: Cassandra > Issue Type: Sub-task > Components: CQL >Reporter: Stefania >Assignee: Stefania > Labels: client-impacting > Fix For: 3.x > > > We already use 7 bits for the flags of the QUERY message, and since they are > encoded with a fixed size byte, we may be forced to change the structure of > the message soon, and I'd like to do this in version 5 but without wasting > bytes on the wire. Therefore, I propose to convert fixed flag's bytes to > unsigned vints, as defined in CASSANDRA-9499. The only exception would be the > flags in the frame, which should stay as fixed size. > Up to 7 bits, vints are encoded the same as bytes are, so no immediate change > would be required in the drivers, although they should plan to support vint > flags if supporting version 5. Moving forward, when a new flag is required > for the QUERY message, and eventually when other flags reach 8 bits in other > messages too, the flag's bitmaps would be automatically encoded with a size > that is big enough to accommodate all flags, but no bigger than required. We > can currently support up to 8 bytes with unsigned vints. 
> The downside is that drivers need to implement unsigned vint encoding for version 5, but this is already required by CASSANDRA-11873, and will most likely be required by CASSANDRA-11622 as well.
> I would also like to add the list of versions to the SUPPORTED message, in order to simplify the handshake for drivers that prefer to send an OPTION message, rather than rely on receiving an error for an unsupported version in the STARTUP message. Said error should also contain the full list of supported versions, not just the min and max, for clarity, and because the latest version is now a beta version.
> Finally, we currently store versions as integer constants in {{Server.java}}, and we still have a fair bit of hard-coded numbers in the code, especially in tests. I plan to clean this up by introducing a {{ProtocolVersion}} enum.
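To illustrate why flag values of up to 7 bits would look identical on the wire after the switch to unsigned vints, here is a minimal encoder/decoder sketch. It assumes a length-prefix scheme in which the number of leading 1 bits in the first byte gives the number of extra bytes that follow, which is how I understand the CASSANDRA-9499 encoding; the class and method names are hypothetical and this is not the project's actual implementation.

```java
public class UnsignedVIntSketch {
    // Number of bytes needed: 7 payload bits per byte, with a 9-byte form
    // (first byte 0xFF) carrying a full 64-bit value.
    static int sizeOf(long value) {
        int magnitude = Long.numberOfLeadingZeros(value | 1);
        return (639 - magnitude * 9) >> 6;
    }

    static byte[] encode(long value) {
        int size = sizeOf(value);
        int extraBytes = size - 1;
        byte[] out = new byte[size];
        // Write the value big-endian, lowest byte last.
        for (int i = extraBytes; i >= 0; i--) {
            out[i] = (byte) value;
            value >>>= 8;
        }
        // Mark the length: one leading 1 bit per extra byte.
        out[0] |= (byte) ~(0xFF >> extraBytes);
        return out;
    }

    static long decode(byte[] in) {
        int first = in[0] & 0xFF;
        // Count the leading 1 bits of the first byte.
        int extraBytes = Integer.numberOfLeadingZeros(~first & 0xFF) - 24;
        long value = first & (0xFF >> extraBytes);
        for (int i = 1; i <= extraBytes; i++) {
            value = (value << 8) | (in[i] & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) {
        long sevenBitFlags = 0x01 | 0x40;        // fits in 7 bits
        long nineBitFlags = sevenBitFlags | 0x100; // a hypothetical ninth flag
        System.out.println(encode(sevenBitFlags).length + " byte(s)");
        System.out.println(encode(nineBitFlags).length + " byte(s)");
    }
}
```

Under this scheme a flags bitmap only grows on the wire once a flag above bit 6 is actually set, which matches the description's point that no immediate driver change would be needed.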
[jira] [Updated] (CASSANDRA-12859) Column-level permissions
[ https://issues.apache.org/jira/browse/CASSANDRA-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Strickland updated CASSANDRA-12859: -- Description: h4. Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) h4. Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. h4. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra * https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. h4. Main day-one requirements # Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the {{GRANT}} statement. The relevant permission types are: {{MODIFY}} (for {{UPDATE}} and {{INSERT}}) and {{SELECT}}. # Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. # Enforce the column access restrictions during execution. Details: #* Should fit with the existing permission propagation down a role chain. #* Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" #* Error will report only the first checked column. Nice to have: list all inaccessible columns. #* Error code is the same as for table access denial: 2100. h4. 
Additional day-one requirements # Reflect the column-level permissions in statements of type {{LIST ALL PERMISSIONS OF someuser;}} # Performance should not degrade in any significant way. # Backwards compatibility #* Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. #* Previous CQL syntax will remain valid, and have the same effect as before. h4. Documentation * https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission * Feedback request: any others? was: h4. Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) h4. Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. h4. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra * https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. h4. Main day-one requirements # Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the {{GRANT}} statement. The relevant permission types are: {{MODIFY}} (for {{UPDATE}} and {{INSERT}}) and {{SELECT}}. # Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. # Enforce the column access restrictions during execution. Details: #* Should fit with the existing permission propagation down a role chain. 
#* Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" #* Error will report only the first checked column. Nice to have: list all inaccessible columns. #* Error code is the same as for table access denial: 2100. Additional day-one requirements # Reflect the column-level permissions in statements of type {{LIST ALL PERMISSIONS OF someuser;}} # Performance should not degrade in any significant way. # Backwards compatibility #* Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. #* Previous CQL syntax will remain valid, and have the same effect as before. h4. Documentation * https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission * Feedback request: any others? > Column-level permissions > > > Key: CASSANDRA-12859 >
svn commit: r1767051 - in /cassandra/site: publish/index.html src/index.html
Author: mshuler Date: Fri Oct 28 18:06:24 2016 New Revision: 1767051 URL: http://svn.apache.org/viewvc?rev=1767051&view=rev Log: Fix typo Modified: cassandra/site/publish/index.html cassandra/site/src/index.html Modified: cassandra/site/publish/index.html URL: http://svn.apache.org/viewvc/cassandra/site/publish/index.html?rev=1767051&r1=1767050&r2=1767051&view=diff == --- cassandra/site/publish/index.html (original) +++ cassandra/site/publish/index.html Fri Oct 28 18:06:24 2016 @@ -169,7 +169,7 @@ Some of the largest production deployments include Apple's, with over 75,000 nodes storing over 10 PB of data, Netflix (2,500 nodes, 420 TB, over 1 trillion requests per day), Chinese search engine Easou (270 nodes, 300 TB, -over 800 million reqests per day), and eBay (over 100 nodes, 250 TB). +over 800 million requests per day), and eBay (over 100 nodes, 250 TB). Modified: cassandra/site/src/index.html URL: http://svn.apache.org/viewvc/cassandra/site/src/index.html?rev=1767051&r1=1767050&r2=1767051&view=diff == --- cassandra/site/src/index.html (original) +++ cassandra/site/src/index.html Fri Oct 28 18:06:24 2016 @@ -69,7 +69,7 @@ is_homepage: true Some of the largest production deployments include Apple's, with over 75,000 nodes storing over 10 PB of data, Netflix (2,500 nodes, 420 TB, over 1 trillion requests per day), Chinese search engine Easou (270 nodes, 300 TB, -over 800 million reqests per day), and eBay (over 100 nodes, 250 TB). +over 800 million requests per day), and eBay (over 100 nodes, 250 TB).
[jira] [Updated] (CASSANDRA-12462) NullPointerException in CompactionInfo.getId(CompactionInfo.java:65)
[ https://issues.apache.org/jira/browse/CASSANDRA-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-12462: --- Reviewer: Yuki Morishita (was: Jonathan Ellis) > NullPointerException in CompactionInfo.getId(CompactionInfo.java:65) > > > Key: CASSANDRA-12462 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12462 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jonathan DePrizio > Attachments: > 0001-Fix-NPE-when-running-nodetool-compactionstats.patch, > CASSANDRA-12462-v2.patch > > > Note: The same trace is cited in the last comment of > https://issues.apache.org/jira/browse/CASSANDRA-11961 > I've noticed that some of my nodes in my 2.1 cluster have fallen way behind > on compactions, and have huge numbers (thousands) of uncompacted, tiny > SSTables (~30MB or so). > In diagnosing the issue, I've found that "nodetool compactionstats" returns > the exception below. Restarting cassandra on the node here causes the > pending tasks count to jump to ~2000. Compactions run properly for about an > hour, until this exception occurs again. Once it occurs, I see the pending > tasks value rapidly drop towards zero, but without any compactions actually > running (the logs show no compactions finishing). It would seem that this is > causing compactions to fail on this node, which is leading to it running out > of space, etc. 
> [redacted]# nodetool compactionstats
> xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms12G -Xmx12G -Xmn1000M -Xss255k
> pending tasks: 5
> error: null
> -- StackTrace --
> java.lang.NullPointerException
>     at org.apache.cassandra.db.compaction.CompactionInfo.getId(CompactionInfo.java:65)
>     at org.apache.cassandra.db.compaction.CompactionInfo.asMap(CompactionInfo.java:118)
>     at org.apache.cassandra.db.compaction.CompactionManager.getCompactions(CompactionManager.java:1405)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Unknown Source)
>     at sun.reflect.misc.Trampoline.invoke(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Unknown Source)
>     at sun.reflect.misc.MethodUtil.invoke(Unknown Source)
>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown Source)
>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown Source)
>     at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(Unknown Source)
>     at com.sun.jmx.mbeanserver.PerInterface.getAttribute(Unknown Source)
>     at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(Unknown Source)
>     at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(Unknown Source)
>     at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(Unknown Source)
>     at javax.management.remote.rmi.RMIConnectionImpl.doOperation(Unknown Source)
>     at javax.management.remote.rmi.RMIConnectionImpl.access$300(Unknown Source)
>     at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(Unknown Source)
>     at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(Unknown Source)
>     at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Unknown Source)
>     at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source)
>     at sun.rmi.transport.Transport$1.run(Unknown Source)
>     at sun.rmi.transport.Transport$1.run(Unknown Source)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at sun.rmi.transport.Transport.serviceCall(Unknown Source)
>     at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
[jira] [Updated] (CASSANDRA-12859) Column-level permissions
[ https://issues.apache.org/jira/browse/CASSANDRA-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Strickland updated CASSANDRA-12859: -- Description: h4. Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) h4. Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. h4. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra * https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. h4. Main day-one requirements # Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the {{GRANT}} statement. The relevant permission types are: {{MODIFY}} (for {{UPDATE}} and {{INSERT}}) and {{SELECT}}. # Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. # Enforce the column access restrictions during execution. Details: #* Should fit with the existing permission propagation down a role chain. #* Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" #* Error will report only the first checked column. Nice to have: list all inaccessible columns. #* Error code is the same as for table access denial: 2100. 
Additional day-one requirements # Reflect the column-level permissions in statements of type {{LIST ALL PERMISSIONS OF someuser;}} # Performance should not degrade in any significant way. # Backwards compatibility #* Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. #* Previous CQL syntax will remain valid, and have the same effect as before. h4. Documentation * https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission * Feedback request: any others? was: h4. Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) h4. Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. h4. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra * https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. h4. Main day-one requirements # Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. # Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. # Enforce the column access restrictions during execution. Details: #* Should fit with the existing permission propagation down a role chain. 
#* Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" #* Error will report only the first checked column. Nice to have: list all inaccessible columns. #* Error code is the same as for table access denial: 2100. Additional day-one requirements # Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; # Performance should not degrade in any significant way. # Backwards compatibility #* Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. #* Previous CQL syntax will remain valid, and have the same effect as before. h4. Documentation * https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission * Feedback request: any others? > Column-level permissions > > > Key: CASSANDRA-12859 > URL: https://issues.apache.
[jira] [Updated] (CASSANDRA-12859) Column-level permissions
[ https://issues.apache.org/jira/browse/CASSANDRA-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Strickland updated CASSANDRA-12859: -- Description: h4. Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) h4. Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. h4. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra * https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. h4. Main day-one requirements # Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. # Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. # Enforce the column access restrictions during execution. Details: #* Should fit with the existing permission propagation down a role chain. #* Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" #* Error will report only the first checked column. Nice to have: list all inaccessible columns. #* Error code is the same as for table access denial: 2100. 
Additional day-one requirements # Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; # Performance should not degrade in any significant way. # Backwards compatibility #* Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. #* Previous CQL syntax will remain valid, and have the same effect as before. h4. Documentation #* https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission #* Feedback request: any others? was: Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: - http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra - https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. Main day-one requirements 1. Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. 2. Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. 3. Enforce the column access restrictions during execution. Details: a.Should fit with the existing permission propagation down a role chain. 
b.Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" c.Error will report only the first checked column. Nice to have: list all inaccessible columns. d.Error code is the same as for table access denial: 2100. Additional day-one requirements 4. Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; 5. Performance should not degrade in any significant way. 6. Backwards compatibility a.Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. b.Previous CQL syntax will remain valid, and have the same effect as before. 7. Documentation o https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission o Feedback request: any others? > Column-level permissions > > > Key: CASSAN
[jira] [Updated] (CASSANDRA-12859) Column-level permissions
[ https://issues.apache.org/jira/browse/CASSANDRA-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Strickland updated CASSANDRA-12859: -- Description: Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: - http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra - https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. Main day-one requirements 1. Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. 2. Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. 3. Enforce the column access restrictions during execution. Details: a.Should fit with the existing permission propagation down a role chain. b.Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" c.Error will report only the first checked column. Nice to have: list all inaccessible columns. d.Error code is the same as for table access denial: 2100. Additional day-one requirements 4. Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; 5. 
Performance should not degrade in any significant way. 6. Backwards compatibility a.Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. b.Previous CQL syntax will remain valid, and have the same effect as before. 7. Documentation o https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission o Feedback request: any others? was: Here is a draft of: Cassandra Proposal - Column-level permissions.docx https://ibm.box.com/s/ithyzt0bhlcfb49dl5x6us0c887p1ovw Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: - http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra - https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. Main day-one requirements 1. Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. 2. Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. 3. Enforce the column access restrictions during execution. Details: a.Should fit with the existing permission propagation down a role chain. 
b.Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" c.Error will report only the first checked column. Nice to have: list all inaccessible columns. d.Error code is the same as for table access denial: 2100. Additional day-one requirements 4. Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; 5. Performance should not degrade in any significant way. 6. Backwards compatibility a.Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. b.Previous CQL syntax will remain valid, and have the same effect as before. 7. Documentation o https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission o
[jira] [Updated] (CASSANDRA-12859) Column-level permissions
[ https://issues.apache.org/jira/browse/CASSANDRA-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Strickland updated CASSANDRA-12859: -- Description: h4. Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) h4. Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. h4. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra * https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. h4. Main day-one requirements # Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. # Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. # Enforce the column access restrictions during execution. Details: #* Should fit with the existing permission propagation down a role chain. #* Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" #* Error will report only the first checked column. Nice to have: list all inaccessible columns. #* Error code is the same as for table access denial: 2100. 
Additional day-one requirements # Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; # Performance should not degrade in any significant way. # Backwards compatibility #* Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. #* Previous CQL syntax will remain valid, and have the same effect as before. h4. Documentation * https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission * Feedback request: any others? was: h4. Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) h4. Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. h4. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra * https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. h4. Main day-one requirements # Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. # Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. # Enforce the column access restrictions during execution. Details: #* Should fit with the existing permission propagation down a role chain. 
#* Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" #* Error will report only the first checked column. Nice to have: list all inaccessible columns. #* Error code is the same as for table access denial: 2100. Additional day-one requirements # Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; # Performance should not degrade in any significant way. # Backwards compatibility #* Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. #* Previous CQL syntax will remain valid, and have the same effect as before. h4. Documentation * https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission * Feedback request: any others? > Column-level permissions > > > Key: CASSANDRA-12859 > URL: https://issues.apache.org/jira/browse/CASSANDRA
[jira] [Updated] (CASSANDRA-12859) Column-level permissions
[ https://issues.apache.org/jira/browse/CASSANDRA-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Strickland updated CASSANDRA-12859: -- Description: h4. Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) h4. Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. h4. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra * https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. h4. Main day-one requirements # Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. # Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. # Enforce the column access restrictions during execution. Details: #* Should fit with the existing permission propagation down a role chain. #* Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" #* Error will report only the first checked column. Nice to have: list all inaccessible columns. #* Error code is the same as for table access denial: 2100. 
Additional day-one requirements # Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; # Performance should not degrade in any significant way. # Backwards compatibility #* Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. #* Previous CQL syntax will remain valid, and have the same effect as before. h4. Documentation * https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission * Feedback request: any others? was: h4. Here is a draft of: Cassandra Proposal - Column-level permissions.docx (attached) h4. Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. h4. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: * http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra * https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. h4. Main day-one requirements # Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. # Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. # Enforce the column access restrictions during execution. Details: #* Should fit with the existing permission propagation down a role chain. 
#* Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" #* Error will report only the first checked column. Nice to have: list all inaccessible columns. #* Error code is the same as for table access denial: 2100. Additional day-one requirements # Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; # Performance should not degrade in any significant way. # Backwards compatibility #* Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. #* Previous CQL syntax will remain valid, and have the same effect as before. h4. Documentation #* https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission #* Feedback request: any others? > Column-level permissions > > > Key: CASSANDRA-12859 > URL: https://issues.apache.org/jira/browse/CASSANDR
[jira] [Updated] (CASSANDRA-12859) Column-level permissions
[ https://issues.apache.org/jira/browse/CASSANDRA-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Strickland updated CASSANDRA-12859: -- Attachment: Cassandra Proposal - Column-level permissions.docx > Column-level permissions > > > Key: CASSANDRA-12859 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12859 > Project: Cassandra > Issue Type: New Feature > Components: Core, CQL >Reporter: Boris Melamed > Attachments: Cassandra Proposal - Column-level permissions.docx > > Original Estimate: 504h > Remaining Estimate: 504h > > Here is a draft of: > Cassandra Proposal - Column-level permissions.docx > https://ibm.box.com/s/ithyzt0bhlcfb49dl5x6us0c887p1ovw > Quoting the 'Overview' section: > The purpose of this proposal is to add column-level (field-level) permissions > to Cassandra. It is my intent to soon start implementing this feature in a > fork, and to submit a pull request once it’s ready. > Motivation > Cassandra already supports permissions on keyspace and table (column family) > level. Sources: > - http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra > - https://cassandra.apache.org/doc/latest/cql/security.html#data-control > At IBM, we have use cases in the area of big data analytics where > column-level access permissions are also a requirement. All industry RDBMS > products are supporting this level of permission control, and regulators are > expecting it from all data-based systems. > Main day-one requirements > 1.Extend CQL (Cassandra Query Language) to be able to optionally specify > a list of individual columns, in the GRANT statement. The relevant permission > types are: MODIFY (for UPDATE and INSERT) and SELECT. > 2.Persist the optional information in the appropriate system table > ‘system_auth.role_permissions’. > 3.Enforce the column access restrictions during execution. Details: > a. Should fit with the existing permission propagation down a role chain. > b. 
Proposed message format when a user’s roles give access to the queried > table but not to all of the selected, inserted, or updated columns: > "User %s has no %s permission on column %s of table %s" > c. Error will report only the first checked column. > Nice to have: list all inaccessible columns. > d. Error code is the same as for table access denial: 2100. > Additional day-one requirements > 4.Reflect the column-level permissions in statements of type > LIST ALL PERMISSIONS OF someuser; > 5.Performance should not degrade in any significant way. > 6.Backwards compatibility > a. Permission enforcement for DBs created before the upgrade should > continue to work with the same behavior after upgrading to a version that > allows column-level permissions. > b. Previous CQL syntax will remain valid, and have the same effect as > before. > 7.Documentation > o > https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission > o Feedback request: any others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
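The enforcement rule in requirement 3 of the proposal can be pictured with a small sketch. This is only an illustration under stated assumptions, not Cassandra code: the class `ColumnPermissionCheck`, its methods, and the convention that an empty granted set stands for a plain table-level grant are all hypothetical. Only the message format and the "report only the first checked column" behaviour come from the proposal text; the constant reflects the native protocol's Unauthorized code, which is written in hex as 0x2100.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Cassandra.
final class ColumnPermissionCheck
{
    // Message format quoted from the proposal.
    static final String MESSAGE_FORMAT = "User %s has no %s permission on column %s of table %s";

    // Unauthorized error code of the native protocol (the proposal writes it as 2100).
    static final int UNAUTHORIZED_CODE = 0x2100;

    /**
     * Returns the first requested column that the grant does not cover, if any.
     * Assumption: an empty granted set means a table-level grant with no column
     * restriction, so grants created before an upgrade keep their old behaviour.
     */
    static Optional<String> firstDeniedColumn(Set<String> grantedColumns, List<String> requestedColumns)
    {
        if (grantedColumns.isEmpty())
            return Optional.empty();

        for (String column : requestedColumns)
            if (!grantedColumns.contains(column))
                return Optional.of(column); // only the first checked column is reported

        return Optional.empty();
    }

    static String denialMessage(String user, String permission, String column, String table)
    {
        return String.format(MESSAGE_FORMAT, user, permission, column, table);
    }
}
```

Listing every inaccessible column (the "nice to have" item) would mean collecting all misses instead of returning on the first one.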
[jira] [Commented] (CASSANDRA-12462) NullPointerException in CompactionInfo.getId(CompactionInfo.java:65)
[ https://issues.apache.org/jira/browse/CASSANDRA-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615987#comment-15615987 ] Simon Zhou commented on CASSANDRA-12462: [~yukim], could you help review patch v2? Thanks. > NullPointerException in CompactionInfo.getId(CompactionInfo.java:65) > > > Key: CASSANDRA-12462 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12462 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jonathan DePrizio > Attachments: > 0001-Fix-NPE-when-running-nodetool-compactionstats.patch, > CASSANDRA-12462-v2.patch > > > Note: The same trace is cited in the last comment of > https://issues.apache.org/jira/browse/CASSANDRA-11961 > I've noticed that some of my nodes in my 2.1 cluster have fallen way behind > on compactions, and have huge numbers (thousands) of uncompacted, tiny > SSTables (~30MB or so). > In diagnosing the issue, I've found that "nodetool compactionstats" returns > the exception below. Restarting cassandra on the node here causes the > pending tasks count to jump to ~2000. Compactions run properly for about an > hour, until this exception occurs again. Once it occurs, I see the pending > tasks value rapidly drop towards zero, but without any compactions actually > running (the logs show no compactions finishing). It would seem that this is > causing compactions to fail on this node, which is leading to it running out > of space, etc. 
> [redacted]# nodetool compactionstats > xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms12G -Xmx12G > -Xmn1000M -Xss255k > pending tasks: 5 > error: null > -- StackTrace -- > java.lang.NullPointerException > at > org.apache.cassandra.db.compaction.CompactionInfo.getId(CompactionInfo.java:65) > at > org.apache.cassandra.db.compaction.CompactionInfo.asMap(CompactionInfo.java:118) > at > org.apache.cassandra.db.compaction.CompactionManager.getCompactions(CompactionManager.java:1405) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > at java.lang.reflect.Method.invoke(Unknown Source) > at sun.reflect.misc.Trampoline.invoke(Unknown Source) > at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > at java.lang.reflect.Method.invoke(Unknown Source) > at sun.reflect.misc.MethodUtil.invoke(Unknown Source) > at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown > Source) > at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown > Source) > at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(Unknown Source) > at com.sun.jmx.mbeanserver.PerInterface.getAttribute(Unknown Source) > at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(Unknown Source) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(Unknown > Source) > at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(Unknown Source) > at javax.management.remote.rmi.RMIConnectionImpl.doOperation(Unknown > Source) > at javax.management.remote.rmi.RMIConnectionImpl.access$300(Unknown > Source) > at > javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(Unknown > Source) > at > javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(Unknown > Source) 
> at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(Unknown > Source) > at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > at java.lang.reflect.Method.invoke(Unknown Source) > at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source) > at sun.rmi.transport.Transport$1.run(Unknown Source) > at sun.rmi.transport.Transport$1.run(Unknown Source) > at java.security.AccessController.doPrivileged(Native Method) > at sun.rmi.transport.Transport.serviceCall(Unknown Source) > at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source) > at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown > Source) > at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown > Source) > at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > at java.lang.Thread.run(Unknown Source) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11218) Prioritize Secondary Index rebuild
[ https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615954#comment-15615954 ]

Jeff Jirsa commented on CASSANDRA-11218:

Was chatting with [~beobal] on IRC briefly about starvation, and figured it's worth mentioning here:

{quote}
09:52 also worth thinking about - within a type/level, is size the best comparator? or hotness?
09:52 and if size, is largest first? or smallest first?
09:52 i'm not sure there's a universal right answer
{quote}

[~krummas] and [~kohlisankalp] - do you have thoughts on ordering compaction tasks properly within a type? Is there something that's universally more likely to be best for the user? Hotness? Size of sstables (sorted smallest to largest, or largest to smallest)? As written, it's doing the largest transactions first - do either of you have strong opinions that one specific approach is right or wrong?

> Prioritize Secondary Index rebuild
> ----------------------------------
>
>                 Key: CASSANDRA-11218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: sankalp kohli
>            Assignee: Jeff Jirsa
>            Priority: Minor
>
> We have seen that secondary index rebuild get stuck behind other compaction
> during a bootstrap and other operations. This causes things to not finish. We
> should prioritize index rebuild via a separate thread pool or using a
> priority queue.
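To make the ordering question concrete, here is a rough sketch of the two candidate orderings: task type decides first, and size only breaks ties within a type. Everything in it is hypothetical (the `Type` enum, the `Task` class, and the comparator names are made up for illustration and are not taken from the patch); it only shows how "largest first" and "smallest first" differ once the type is fixed.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical sketch; not the classes from the CASSANDRA-11218 patch.
final class CompactionOrderingSketch
{
    // Assumed task types; a lower ordinal is served first.
    enum Type { INDEX_BUILD, COMPACTION }

    static final class Task
    {
        final Type type;
        final long sizeBytes;

        Task(Type type, long sizeBytes)
        {
            this.type = type;
            this.sizeBytes = sizeBytes;
        }
    }

    // Type always wins; size is only consulted within the same type.
    static final Comparator<Task> BY_TYPE = Comparator.comparing((Task t) -> t.type);

    static final Comparator<Task> LARGEST_FIRST =
        BY_TYPE.thenComparing(Comparator.comparingLong((Task t) -> t.sizeBytes).reversed());

    static final Comparator<Task> SMALLEST_FIRST =
        BY_TYPE.thenComparingLong(t -> t.sizeBytes);

    /** Queues the tasks under the given ordering and returns their sizes in execution order. */
    static long[] drainSizes(Comparator<Task> order, Task... tasks)
    {
        PriorityQueue<Task> queue = new PriorityQueue<>(order);
        for (Task task : tasks)
            queue.add(task);

        long[] sizes = new long[tasks.length];
        for (int i = 0; i < sizes.length; i++)
            sizes[i] = queue.poll().sizeBytes;
        return sizes;
    }
}
```

A "hotness" ordering would slot in the same way, as a third comparator chained after `BY_TYPE`, provided some read-rate figure is available per task.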
[jira] [Commented] (CASSANDRA-11218) Prioritize Secondary Index rebuild
[ https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615899#comment-15615899 ]

Jeff Jirsa commented on CASSANDRA-11218:

The refactor to break apart {{Priorities}} / {{Prioritized}} is much cleaner, thanks.

{quote}
Was there any special reasoning behind making the priority fields in PrioritizedCompactionFutureTask/PrioritizedCompactionCallable/PrioritizedCompactionWrappedRunnable atomics rather than just final? I couldn't see how they would ever be mutated, did I overlook something?
{quote}

I was somewhat worried about possible starvation within a type/level, where a pathological case caused recompaction of very large files (used to happen a lot in early DTCS), so my thought was that on each compare, we could adjust (getAndIncrement) the subtype priority to add an element of "how long has it been in queue". I'm not sure if that's a real concern, so I backed that out to reduce complexity and make it easier to reason about.

Will look at the rest this afternoon.

> Prioritize Secondary Index rebuild
> ----------------------------------
>
>                 Key: CASSANDRA-11218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: sankalp kohli
>            Assignee: Jeff Jirsa
>            Priority: Minor
>
> We have seen that secondary index rebuild get stuck behind other compaction
> during a bootstrap and other operations. This causes things to not finish. We
> should prioritize index rebuild via a separate thread pool or using a
> priority queue.
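The "how long has it been in queue" idea can be sketched as priority aging. The version below is a deliberately simplified illustration and not what the patch did (the comment says the idea was backed out): instead of bumping the counter inside a comparator, which would change keys underneath a heap, it scans a plain list and ages whatever is left behind after each pick. The class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of priority aging; not code from the CASSANDRA-11218 patch.
final class AgingQueueSketch
{
    static final class Task
    {
        final String name;
        final int basePriority;                         // lower value runs earlier
        final AtomicInteger age = new AtomicInteger();  // bumped each time the task is passed over

        Task(String name, int basePriority)
        {
            this.name = name;
            this.basePriority = basePriority;
        }

        int effectivePriority()
        {
            return basePriority - age.get();
        }
    }

    private final List<Task> waiting = new ArrayList<>();

    void add(Task task)
    {
        waiting.add(task);
    }

    /** Removes and returns the task with the lowest effective priority; every task left behind ages by one. */
    Task next()
    {
        Task best = null;
        for (Task task : waiting)
            if (best == null || task.effectivePriority() < best.effectivePriority())
                best = task;

        if (best == null)
            return null;

        waiting.remove(best);
        for (Task task : waiting)
            task.age.getAndIncrement();
        return best;
    }
}
```

With this scheme a task of base priority 3 that keeps losing to fresh priority-1 arrivals catches up after two rounds, which is the starvation case the comment describes. The cost is an O(n) scan per pick and more state to reason about.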
[10/10] cassandra git commit: Merge branch 'cassandra-3.X' into trunk
Merge branch 'cassandra-3.X' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e8b7c0d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e8b7c0d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e8b7c0d

Branch: refs/heads/trunk
Commit: 9e8b7c0d0d00e7f75f893906c6d52951bda0579e
Parents: 0919ae2 9be467a
Author: Sam Tunnicliffe
Authored: Fri Oct 28 16:18:13 2016 +0100
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:18:13 2016 +0100

----------------------------------------------------------------------
 CHANGES.txt                                     |  1 +
 .../cassandra/auth/CassandraAuthorizer.java     | 13 --
 .../cassandra/auth/PasswordAuthenticator.java   | 42 ++--
 3 files changed, 41 insertions(+), 15 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e8b7c0d/CHANGES.txt
----------------------------------------------------------------------
[jira] [Updated] (CASSANDRA-12813) NPE in auth for bootstrapping node
[ https://issues.apache.org/jira/browse/CASSANDRA-12813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Tunnicliffe updated CASSANDRA-12813:
       Resolution: Fixed
    Fix Version/s: (was: 4.0) (was: 3.0.x) (was: 2.2.x) (was: 3.x) 3.10 3.0.10 2.2.9
           Status: Resolved  (was: Patch Available)

Perfect, thanks [~ifesdjeen].

Committed to 2.2 in {{312e21bda7c50f05fc5f8868740b513022385951}} and merged to 3.0/3.X/trunk

> NPE in auth for bootstrapping node
> ----------------------------------
>
>                 Key: CASSANDRA-12813
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12813
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Charles Mims
>            Assignee: Alex Petrov
>             Fix For: 2.2.9, 3.0.10, 3.10
>
>
> {code}
> ERROR [SharedPool-Worker-1] 2016-10-19 21:40:25,991 Message.java:617 - Unexpected exception during request; channel = [id: 0x15eb017f, / omitted>:40869 => /10.0.0.254:9042]
> java.lang.NullPointerException: null
>   at org.apache.cassandra.auth.PasswordAuthenticator.doAuthenticate(PasswordAuthenticator.java:144) ~[apache-cassandra-3.0.9.jar:3.0.9]
>   at org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:86) ~[apache-cassandra-3.0.9.jar:3.0.9]
>   at org.apache.cassandra.auth.PasswordAuthenticator.access$100(PasswordAuthenticator.java:54) ~[apache-cassandra-3.0.9.jar:3.0.9]
>   at org.apache.cassandra.auth.PasswordAuthenticator$PlainTextSaslAuthenticator.getAuthenticatedUser(PasswordAuthenticator.java:182) ~[apache-cassandra-3.0.9.jar:3.0.9]
>   at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:78) ~[apache-cassandra-3.0.9.jar:3.0.9]
>   at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) [apache-cassandra-3.0.9.jar:3.0.9]
>   at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) [apache-cassandra-3.0.9.jar:3.0.9]
>   at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
>   at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.0.9.jar:3.0.9]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.9.jar:3.0.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
> {code}
> I have a node that has been joining for around 24 hours. My application is
> configured with the IP address of the joining node in the list of nodes to
> connect to (ruby driver), and I have been getting around 200 events of this
> NPE per hour. I removed the IP of the joining node from the list of nodes
> for my app to connect to and the errors stopped.
[01/10] cassandra git commit: Prepare legacy auth statements if tables initialised after node startup
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 eaf46a1c9 -> 312e21bda
  refs/heads/cassandra-3.0 e9ff6ae6f -> e4f840aa1
  refs/heads/cassandra-3.X 49ce0a495 -> 9be467a22
  refs/heads/trunk 0919ae2c2 -> 9e8b7c0d0

Prepare legacy auth statements if tables initialised after node startup

Patch by Alex Petrov; reviewed by Sam Tunnicliffe for CASSANDRA-12813

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/312e21bd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/312e21bd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/312e21bd

Branch: refs/heads/cassandra-2.2
Commit: 312e21bda7c50f05fc5f8868740b513022385951
Parents: eaf46a1
Author: Alex Petrov
Authored: Fri Oct 21 16:58:33 2016 +0200
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:04:36 2016 +0100

----------------------------------------------------------------------
 CHANGES.txt                                     |  1 +
 .../cassandra/auth/CassandraAuthorizer.java     | 14 +--
 .../cassandra/auth/PasswordAuthenticator.java   | 40 ++--
 3 files changed, 40 insertions(+), 15 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index a22439b..b33ef8d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.9
+ * Prepare legacy authenticate statement if credentials table initialised after node startup (CASSANDRA-12813)
  * Change cassandra.wait_for_tracing_events_timeout_secs default to 0 (CASSANDRA-12754)
  * Clean up permissions when a UDA is dropped (CASSANDRA-12720)
  * Limit colUpdateTimeDelta histogram updates to reasonable deltas (CASSANDRA-7)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
index 88069a2..360d59a 100644
--- a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
+++ b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
@@ -209,11 +209,19 @@ public class CassandraAuthorizer implements IAuthorizer
                 Lists.newArrayList(ByteBufferUtil.bytes(role.getRoleName()),
                                    ByteBufferUtil.bytes(resource.getName())));
+        SelectStatement statement;
         // If it exists, read from the legacy user permissions table to handle the case where the cluster
         // is being upgraded and so is running with mixed versions of the authz schema
-        SelectStatement statement = Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null
-                                    ? authorizeRoleStatement
-                                    : legacyAuthorizeRoleStatement;
+        if (Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null)
+            statement = authorizeRoleStatement;
+        else
+        {
+            // If the permissions table was initialised only after the statement got prepared, re-prepare (CASSANDRA-12813)
+            if (legacyAuthorizeRoleStatement == null)
+                legacyAuthorizeRoleStatement = prepare(USERNAME, USER_PERMISSIONS);
+            statement = legacyAuthorizeRoleStatement;
+        }
+
         ResultMessage.Rows rows = statement.execute(QueryState.forInternalCalls(), options) ;
         UntypedResultSet result = UntypedResultSet.create(rows.result);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
index c0d2283..20f8790 100644
--- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
+++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
@@ -77,11 +77,7 @@ public class PasswordAuthenticator implements IAuthenticator
     {
         try
         {
-            // If the legacy users table exists try to verify credentials there. This is to handle the case
-            // where the cluster is being upgraded and so is running with mixed versions of the authn tables
-            SelectStatement authenticationStatement = Schema.instance.getCFMetaData(AuthKeyspace.NAME, LEGACY_CREDENTIALS_TABLE) == null
-                                                      ? authenticateStatement
-                                                      : legacyAuthenticat
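The shape of the CassandraAuthorizer change can be sketched in isolation: the legacy statement is prepared at startup only if its table already exists, and otherwise prepared lazily the first time the table is seen. The sketch below is an illustration under assumptions, not the committed code. The class name, the `BooleanSupplier` standing in for the schema lookup, and the `Supplier<String>` standing in for `prepare(...)` are all hypothetical, and thread safety is deliberately left out.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical sketch of the lazy re-prepare pattern; not the committed Cassandra code.
final class LazyLegacyStatementSketch
{
    private final String currentStatement;            // always prepared at startup
    private final BooleanSupplier legacyTableExists;  // stand-in for the schema lookup
    private final Supplier<String> prepareLegacy;     // stand-in for prepare(...)
    private String legacyStatement;                   // stays null if the legacy table was absent at startup

    LazyLegacyStatementSketch(String currentStatement,
                              BooleanSupplier legacyTableExists,
                              Supplier<String> prepareLegacy)
    {
        this.currentStatement = currentStatement;
        this.legacyTableExists = legacyTableExists;
        this.prepareLegacy = prepareLegacy;
        // Mirrors startup: the legacy statement is only prepared if its table is already there.
        this.legacyStatement = legacyTableExists.getAsBoolean() ? prepareLegacy.get() : null;
    }

    String statementToExecute()
    {
        if (!legacyTableExists.getAsBoolean())
            return currentStatement;

        // The table appeared after startup, so the statement was never prepared: do it now.
        if (legacyStatement == null)
            legacyStatement = prepareLegacy.get();
        return legacyStatement;
    }
}
```

Without the null check, a table that shows up after startup would select a statement that was never prepared, which is consistent with the NullPointerException reported in CASSANDRA-12813.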
[jira] [Updated] (CASSANDRA-11218) Prioritize Secondary Index rebuild
[ https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Tunnicliffe updated CASSANDRA-11218:
    Status: Open  (was: Patch Available)

> Prioritize Secondary Index rebuild
> ----------------------------------
>
>                 Key: CASSANDRA-11218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: sankalp kohli
>            Assignee: Jeff Jirsa
>            Priority: Minor
>
> We have seen that secondary index rebuild get stuck behind other compaction
> during a bootstrap and other operations. This causes things to not finish. We
> should prioritize index rebuild via a separate thread pool or using a
> priority queue.
[06/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e4f840aa
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e4f840aa
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e4f840aa

Branch: refs/heads/cassandra-3.0
Commit: e4f840aa1f84f008cf7c14bdd6a22aebd1b41c70
Parents: e9ff6ae 312e21b
Author: Sam Tunnicliffe
Authored: Fri Oct 28 16:08:29 2016 +0100
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:08:29 2016 +0100

----------------------------------------------------------------------
 CHANGES.txt                                     |  1 +
 .../cassandra/auth/CassandraAuthorizer.java     | 14 +--
 .../cassandra/auth/PasswordAuthenticator.java   | 40 ++--
 3 files changed, 40 insertions(+), 15 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4f840aa/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index 9910245,b33ef8d..bf1e7d6
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,30 -1,5 +1,31 @@@
-2.2.9
+3.0.10
+ * Fix for KeyCacheCqlTest flakiness (CASSANDRA-12801)
+ * Include SSTable filename in compacting large row message (CASSANDRA-12384)
+ * Fix potential socket leak (CASSANDRA-12329, CASSANDRA-12330)
+ * Fix ViewTest.testCompaction (CASSANDRA-12789)
+ * Improve avg aggregate functions (CASSANDRA-12417)
+ * Preserve quoted reserved keyword column names in MV creation (CASSANDRA-11803)
+ * nodetool stopdaemon errors out (CASSANDRA-12646)
+ * Split materialized view mutations on build to prevent OOM (CASSANDRA-12268)
+ * mx4j does not work in 3.0.8 (CASSANDRA-12274)
+ * Abort cqlsh copy-from in case of no answer after prolonged period of time (CASSANDRA-12740)
+ * Avoid sstable corrupt exception due to dropped static column (CASSANDRA-12582)
+ * Make stress use client mode to avoid checking commit log size on startup (CASSANDRA-12478)
+ * Fix exceptions with new vnode allocation (CASSANDRA-12715)
+ * Unify drain and shutdown processes (CASSANDRA-12509)
+ * Fix NPE in ComponentOfSlice.isEQ() (CASSANDRA-12706)
+ * Fix failure in LogTransactionTest (CASSANDRA-12632)
+ * Fix potentially incomplete non-frozen UDT values when querying with the
+   full primary key specified (CASSANDRA-12605)
+ * Skip writing MV mutations to commitlog on mutation.applyUnsafe() (CASSANDRA-11670)
+ * Establish consistent distinction between non-existing partition and NULL value for LWTs on static columns (CASSANDRA-12060)
+ * Extend ColumnIdentifier.internedInstances key to include the type that generated the byte buffer (CASSANDRA-12516)
+ * Backport CASSANDRA-10756 (race condition in NativeTransportService shutdown) (CASSANDRA-12472)
+ * If CF has no clustering columns, any row cache is full partition cache (CASSANDRA-12499)
+ * Correct log message for statistics of offheap memtable flush (CASSANDRA-12776)
+ * Explicitly set locale for string validation (CASSANDRA-12541,CASSANDRA-12542,CASSANDRA-12543,CASSANDRA-12545)
+Merged from 2.2:
+ * Prepare legacy authenticate statement if credentials table initialised after node startup (CASSANDRA-12813)
 * Change cassandra.wait_for_tracing_events_timeout_secs default to 0 (CASSANDRA-12754)
 * Clean up permissions when a UDA is dropped (CASSANDRA-12720)
 * Limit colUpdateTimeDelta histogram updates to reasonable deltas (CASSANDRA-7)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4f840aa/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4f840aa/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
----------------------------------------------------------------------
[03/10] cassandra git commit: Prepare legacy auth statements if tables initialised after node startup
Prepare legacy auth statements if tables initialised after node startup

Patch by Alex Petrov; reviewed by Sam Tunnicliffe for CASSANDRA-12813

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/312e21bd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/312e21bd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/312e21bd

Branch: refs/heads/cassandra-3.X
Commit: 312e21bda7c50f05fc5f8868740b513022385951
Parents: eaf46a1
Author: Alex Petrov
Authored: Fri Oct 21 16:58:33 2016 +0200
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:04:36 2016 +0100

--
 CHANGES.txt                                   |  1 +
 .../cassandra/auth/CassandraAuthorizer.java   | 14 +--
 .../cassandra/auth/PasswordAuthenticator.java | 40 ++--
 3 files changed, 40 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index a22439b..b33ef8d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.9
+ * Prepare legacy authenticate statement if credentials table initialised after node startup (CASSANDRA-12813)
  * Change cassandra.wait_for_tracing_events_timeout_secs default to 0 (CASSANDRA-12754)
  * Clean up permissions when a UDA is dropped (CASSANDRA-12720)
  * Limit colUpdateTimeDelta histogram updates to reasonable deltas (CASSANDRA-7)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
--
diff --git a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
index 88069a2..360d59a 100644
--- a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
+++ b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
@@ -209,11 +209,19 @@ public class CassandraAuthorizer implements IAuthorizer
                 Lists.newArrayList(ByteBufferUtil.bytes(role.getRoleName()),
                                    ByteBufferUtil.bytes(resource.getName(;
 
+        SelectStatement statement;
         // If it exists, read from the legacy user permissions table to handle the case where the cluster
         // is being upgraded and so is running with mixed versions of the authz schema
-        SelectStatement statement = Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null
-                                    ? authorizeRoleStatement
-                                    : legacyAuthorizeRoleStatement;
+        if (Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null)
+            statement = authorizeRoleStatement;
+        else
+        {
+            // If the permissions table was initialised only after the statement got prepared, re-prepare (CASSANDRA-12813)
+            if (legacyAuthorizeRoleStatement == null)
+                legacyAuthorizeRoleStatement = prepare(USERNAME, USER_PERMISSIONS);
+            statement = legacyAuthorizeRoleStatement;
+        }
+
         ResultMessage.Rows rows = statement.execute(QueryState.forInternalCalls(), options) ;
         UntypedResultSet result = UntypedResultSet.create(rows.result);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
--
diff --git a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
index c0d2283..20f8790 100644
--- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
+++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
@@ -77,11 +77,7 @@ public class PasswordAuthenticator implements IAuthenticator
     {
         try
         {
-            // If the legacy users table exists try to verify credentials there. This is to handle the case
-            // where the cluster is being upgraded and so is running with mixed versions of the authn tables
-            SelectStatement authenticationStatement = Schema.instance.getCFMetaData(AuthKeyspace.NAME, LEGACY_CREDENTIALS_TABLE) == null
-                                                      ? authenticateStatement
-                                                      : legacyAuthenticateStatement;
+            SelectStatement authenticationStatement = authenticationStatement();
             return doAuthenticate(username, password, authenticationStatement);
         }
         catch (RequestExecutionException e)
@@
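The re-prepare step in the CassandraAuthorizer hunk above can be sketched in isolation. This is only a minimal illustration of the "prepare on first use" idea, not the code from the patch: PreparedQuery, StatementSelector, legacyPrepareCount and the table names are hypothetical stand-ins, and the BooleanSupplier takes the place of the Schema.instance.getCFMetaData(...) lookup. Like the hunk, the sketch does no synchronisation around the lazily assigned field.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a prepared SELECT statement; not Cassandra's SelectStatement.
final class PreparedQuery
{
    final String table;

    PreparedQuery(String table)
    {
        this.table = table;
    }
}

// Sketch of choosing between a current and a legacy statement; all names are illustrative.
final class StatementSelector
{
    private final BooleanSupplier legacyTableExists; // stands in for the schema lookup
    private final PreparedQuery current = new PreparedQuery("roles");
    private PreparedQuery legacy;                    // stays null if the table was absent at setup
    int legacyPrepareCount = 0;                      // only here to make the behaviour observable

    StatementSelector(BooleanSupplier legacyTableExists)
    {
        this.legacyTableExists = legacyTableExists;
        // Setup-time behaviour: prepare the legacy statement only if its table exists now.
        if (legacyTableExists.getAsBoolean())
            legacy = prepareLegacy();
    }

    private PreparedQuery prepareLegacy()
    {
        legacyPrepareCount++;
        return new PreparedQuery("legacy_permissions");
    }

    PreparedQuery select()
    {
        if (!legacyTableExists.getAsBoolean())
            return current;
        // The table appeared after setup: prepare now rather than hand back null.
        if (legacy == null)
            legacy = prepareLegacy();
        return legacy;
    }

    public static void main(String[] args)
    {
        boolean[] exists = { false };
        StatementSelector selector = new StatementSelector(() -> exists[0]);
        System.out.println(selector.select().table);
        exists[0] = true; // simulate the legacy table being created after startup
        System.out.println(selector.select().table);
    }
}
```

Without the null check in select(), the second call would return a null statement once the table shows up, which is the situation the commit message describes.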
[09/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.X
Merge branch 'cassandra-3.0' into cassandra-3.X

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9be467a2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9be467a2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9be467a2

Branch: refs/heads/trunk
Commit: 9be467a22ab646c23710eb2b23dd21e5a46b07bd
Parents: 49ce0a4 e4f840a
Author: Sam Tunnicliffe
Authored: Fri Oct 28 16:11:13 2016 +0100
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:17:52 2016 +0100

--
 CHANGES.txt                                   |  1 +
 .../cassandra/auth/CassandraAuthorizer.java   | 13 --
 .../cassandra/auth/PasswordAuthenticator.java | 42 ++--
 3 files changed, 41 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9be467a2/CHANGES.txt
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/9be467a2/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
--
diff --cc src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
index 8c3485d,7ffef27..7f44eef
--- a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
+++ b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
@@@ -215,12 -209,20 +215,19 @@@ public class CassandraAuthorizer implem
                  Lists.newArrayList(ByteBufferUtil.bytes(role.getRoleName()),
                                     ByteBufferUtil.bytes(resource.getName(;
  
+         SelectStatement statement;
          // If it exists, read from the legacy user permissions table to handle the case where the cluster
          // is being upgraded and so is running with mixed versions of the authz schema
-         SelectStatement statement = Schema.instance.getCFMetaData(SchemaConstants.AUTH_KEYSPACE_NAME, USER_PERMISSIONS) == null
-                                     ? authorizeRoleStatement
-                                     : legacyAuthorizeRoleStatement;
 -        if (Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null)
++        if (Schema.instance.getCFMetaData(SchemaConstants.AUTH_KEYSPACE_NAME, USER_PERMISSIONS) == null)
+             statement = authorizeRoleStatement;
+         else
+         {
+             // If the permissions table was initialised only after the statement got prepared, re-prepare (CASSANDRA-12813)
+             if (legacyAuthorizeRoleStatement == null)
+                 legacyAuthorizeRoleStatement = prepare(USERNAME, USER_PERMISSIONS);
+             statement = legacyAuthorizeRoleStatement;
+         }
- 
 -        ResultMessage.Rows rows = statement.execute(QueryState.forInternalCalls(), options) ;
 +        ResultMessage.Rows rows = statement.execute(QueryState.forInternalCalls(), options, System.nanoTime());
          UntypedResultSet result = UntypedResultSet.create(rows.result);
  
          if (!result.isEmpty() && result.one().has(PERMISSIONS))

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9be467a2/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
--
diff --cc src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
index b0317f3,ca610d1..4b667ae
--- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
+++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
@@@ -86,56 -78,8 +86,52 @@@ public class PasswordAuthenticator impl
      {
          try
          {
 +            String hash = cache.get(username);
 +            if (!BCrypt.checkpw(password, hash))
 +                throw new AuthenticationException(String.format("Provided username %s and/or password are incorrect", username));
 +
 +            return new AuthenticatedUser(username);
 +        }
 +        catch (ExecutionException | UncheckedExecutionException e)
 +        {
 +            // the credentials were somehow invalid - either a non-existent role, or one without a defined password
 +            if (e.getCause() instanceof NoSuchCredentialsException)
 +                throw new AuthenticationException(String.format("Provided username %s and/or password are incorrect", username));
 +
 +            // an unanticipated exception occured whilst querying the credentials table
 +            if (e.getCause() instanceof RequestExecutionException)
 +            {
 +                logger.trace("Error performing internal authentication", e);
 +                throw new AuthenticationException(String.format("Error during authentication of user %s : %s", username, e.getMessage()));
 +            }
 +
 +
[jira] [Updated] (CASSANDRA-11218) Prioritize Secondary Index rebuild
[ https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Tunnicliffe updated CASSANDRA-11218:
----------------------------------------
    Status: Awaiting Feedback  (was: Open)

> Prioritize Secondary Index rebuild
> ----------------------------------
>
>                 Key: CASSANDRA-11218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: sankalp kohli
>            Assignee: Jeff Jirsa
>            Priority: Minor
>
> We have seen that secondary index rebuild get stuck behind other compaction
> during a bootstrap and other operations. This causes things to not finish. We
> should prioritize index rebuild via a separate thread pool or using a
> priority queue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11218) Prioritize Secondary Index rebuild
[ https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615708#comment-15615708 ]

Sam Tunnicliffe commented on CASSANDRA-11218:
---------------------------------------------

[~jjirsa], sorry it's taken so long to pick up this review, I got stuck on some unrelated stuff until just now.

The concept and general approach are straightforward & seem reasonable enough. I had a bunch of remarks about the implementation, so I thought it would be simpler to just push a branch ([here|https://github.com/beobal/cassandra/tree/CASSANDRA-11218]) with suggestions rather than trying to spell them out here. Let me know what you think, the summary is:

* Refactored {{OperationType}} a bit. Aside from removing duplication, the version in my branch normalizes the property names so that the suffix is always the lowercase version of the type name (there are some inconsistencies in the original version - e.g. {{KEY_CACHE_SAVE:keycachesave}} vs {{USER_DEFINED_COMPACTION:user_defined_compaction}} vs {{GARBAGE_COLLECT:garbage_collection}}).
* Simplified {{CompactionManager.CompactionPriorityComparator}}.
* I feel like the way priorities are carried between the various {{IPrioritizedX}} classes is a bit messy. For example, {{CompactionManager::submitUserDefined}} creates an anonymous instance of {{PrioritizedCompactionWrappedRunnable}} that has a timestamp set in its constructor. That's then submitted to the compaction executor, where {{newTaskFor}} wraps it in an {{IPrioritizedCompactionFutureTask}} using the runnable's type and subtype priority, but taking a new timestamp, dropping the one on the runnable. Ultimately, I ended up encapsulating the 3 priority values and just using that {{Priorities}} class everywhere (I also renamed {{IPrioritizedCompactionComparable}} to {{Prioritized}}).
** Was there any special reasoning behind making the priority fields in {{PrioritizedCompactionFutureTask}}/{{PrioritizedCompactionCallable}}/{{PrioritizedCompactionWrappedRunnable}} atomics rather than just final? I couldn't see how they would ever be mutated, did I overlook something?

One thought that I didn't have a chance to explore in that branch:

* The {{instanceof}} checks and special casing to handle regular vs prioritized runnables are also duplicated between the {{newTaskFor}} methods and the comparator. Could we make {{CompactionExecutor::newTaskFor}} always return an {{IPrioritizedCompactionFutureTask}}, so that when the supplied runnable/callable is already prioritized it just uses its existing values, and when a non-prioritized runnable/callable is received it makes it prioritized with the default 0/0/0 values (which is functionally equivalent to what the comparator will do anyway)? Then the comparator can be further simplified by not having to consider non-prioritized tasks.

And a few more minor points:

* On JIRA you suggested prioritizing anticompaction & index summary redistribution joint highest, but in the patch the latter has max priority (and is not overridable). Is that intentional?
* Validation tasks are run on a dedicated executor, which doesn't use a priority queue. This, and the fact that validation is orthogonal to compaction strategy, means that all validation tasks are equal priority & also don't block the other tasks. So it makes the reasoning for explicitly setting validation priority to 256 a little unclear.
* On a related note, {{CacheCleanupExecutor}} is also dedicated to a single task type. This *does* end up using a priority queue, but only ever has vanilla runnables submitted to it.

Finally: Some comments in {{IPrioritizedCompactionComparable/Prioritized}} describing its purpose & maybe the comparison algorithm (e.g. what's described in the comments here, above) could be useful. Also some comments in {{OperationType}} on the fact that priority relates to task scheduling in compaction manager, and why some types have un-overridable priority of {{MAX_VALUE}} might be handy.

> Prioritize Secondary Index rebuild
> ----------------------------------
>
>                 Key: CASSANDRA-11218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: sankalp kohli
>            Assignee: Jeff Jirsa
>            Priority: Minor
>
> We have seen that secondary index rebuild get stuck behind other compaction
> during a bootstrap and other operations. This causes things to not finish. We
> should prioritize index rebuild via a separate thread pool or using a
> priority queue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
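The suggestion in the comment above (one immutable bundle of priority values, and a newTaskFor that always hands back a prioritized task) might look roughly like the sketch below. It is not taken from the patch or from the review branch: Priorities and Prioritized borrow names mentioned in the comment, but their shape here, along with PrioritizedTask and PrioritizingExecutor, is assumed for illustration. The ordering (higher type first, then higher subtype, then older timestamp) is likewise an assumption, since the comparison algorithm is not spelled out in this message.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.RunnableFuture;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative immutable bundle of the three priority values (plain final fields, no atomics).
final class Priorities implements Comparable<Priorities>
{
    static final Priorities DEFAULT = new Priorities(0, 0, 0L);

    final int type;
    final int subtype;
    final long timestamp;

    Priorities(int type, int subtype, long timestamp)
    {
        this.type = type;
        this.subtype = subtype;
        this.timestamp = timestamp;
    }

    // Assumed ordering: higher type first, then higher subtype, then older timestamp.
    @Override
    public int compareTo(Priorities other)
    {
        int c = Integer.compare(other.type, type);
        if (c != 0)
            return c;
        c = Integer.compare(other.subtype, subtype);
        if (c != 0)
            return c;
        return Long.compare(timestamp, other.timestamp);
    }
}

// Implemented by any unit of work that already carries its priorities.
interface Prioritized
{
    Priorities priorities();
}

final class PrioritizedTask<V> extends FutureTask<V> implements Prioritized, Comparable<PrioritizedTask<?>>
{
    private final Priorities priorities;

    PrioritizedTask(Runnable runnable, V result, Priorities priorities)
    {
        super(runnable, result);
        this.priorities = priorities;
    }

    PrioritizedTask(Callable<V> callable, Priorities priorities)
    {
        super(callable);
        this.priorities = priorities;
    }

    @Override
    public Priorities priorities()
    {
        return priorities;
    }

    @Override
    public int compareTo(PrioritizedTask<?> other)
    {
        return priorities.compareTo(other.priorities);
    }
}

final class PrioritizingExecutor extends ThreadPoolExecutor
{
    PrioritizingExecutor(int threads)
    {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<Runnable>());
    }

    // The only place that distinguishes prioritized from plain work.
    private static Priorities prioritiesOf(Object task)
    {
        return task instanceof Prioritized ? ((Prioritized) task).priorities() : Priorities.DEFAULT;
    }

    @Override
    protected <T> RunnableFuture<T> newTaskFor(Runnable runnable, T value)
    {
        return new PrioritizedTask<>(runnable, value, prioritiesOf(runnable));
    }

    @Override
    protected <T> RunnableFuture<T> newTaskFor(Callable<T> callable)
    {
        return new PrioritizedTask<>(callable, prioritiesOf(callable));
    }

    public static void main(String[] args)
    {
        PrioritizingExecutor executor = new PrioritizingExecutor(1);
        // A plain runnable gets wrapped with the default 0/0/0 priorities.
        RunnableFuture<Void> task = executor.newTaskFor(() -> {}, null);
        System.out.println(((Prioritized) task).priorities() == Priorities.DEFAULT);
        executor.shutdown();
    }
}
```

With this shape the queue's ordering only ever sees PrioritizedTask instances for work passed to submit(), so no instanceof handling is needed in the comparison itself. One caveat of the sketch: a raw Runnable handed straight to execute() bypasses newTaskFor and would fail with a ClassCastException if it had to be queued.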
[05/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e4f840aa
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e4f840aa
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e4f840aa

Branch: refs/heads/cassandra-3.X
Commit: e4f840aa1f84f008cf7c14bdd6a22aebd1b41c70
Parents: e9ff6ae 312e21b
Author: Sam Tunnicliffe
Authored: Fri Oct 28 16:08:29 2016 +0100
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:08:29 2016 +0100

--
 CHANGES.txt                                   |  1 +
 .../cassandra/auth/CassandraAuthorizer.java   | 14 +--
 .../cassandra/auth/PasswordAuthenticator.java | 40 ++--
 3 files changed, 40 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4f840aa/CHANGES.txt
--
diff --cc CHANGES.txt
index 9910245,b33ef8d..bf1e7d6
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,30 -1,5 +1,31 @@@
 -2.2.9
 +3.0.10
 + * Fix for KeyCacheCqlTest flakiness (CASSANDRA-12801)
 + * Include SSTable filename in compacting large row message (CASSANDRA-12384)
 + * Fix potential socket leak (CASSANDRA-12329, CASSANDRA-12330)
 + * Fix ViewTest.testCompaction (CASSANDRA-12789)
 + * Improve avg aggregate functions (CASSANDRA-12417)
 + * Preserve quoted reserved keyword column names in MV creation (CASSANDRA-11803)
 + * nodetool stopdaemon errors out (CASSANDRA-12646)
 + * Split materialized view mutations on build to prevent OOM (CASSANDRA-12268)
 + * mx4j does not work in 3.0.8 (CASSANDRA-12274)
 + * Abort cqlsh copy-from in case of no answer after prolonged period of time (CASSANDRA-12740)
 + * Avoid sstable corrupt exception due to dropped static column (CASSANDRA-12582)
 + * Make stress use client mode to avoid checking commit log size on startup (CASSANDRA-12478)
 + * Fix exceptions with new vnode allocation (CASSANDRA-12715)
 + * Unify drain and shutdown processes (CASSANDRA-12509)
 + * Fix NPE in ComponentOfSlice.isEQ() (CASSANDRA-12706)
 + * Fix failure in LogTransactionTest (CASSANDRA-12632)
 + * Fix potentially incomplete non-frozen UDT values when querying with the
 +   full primary key specified (CASSANDRA-12605)
 + * Skip writing MV mutations to commitlog on mutation.applyUnsafe() (CASSANDRA-11670)
 + * Establish consistent distinction between non-existing partition and NULL value for LWTs on static columns (CASSANDRA-12060)
 + * Extend ColumnIdentifier.internedInstances key to include the type that generated the byte buffer (CASSANDRA-12516)
 + * Backport CASSANDRA-10756 (race condition in NativeTransportService shutdown) (CASSANDRA-12472)
 + * If CF has no clustering columns, any row cache is full partition cache (CASSANDRA-12499)
 + * Correct log message for statistics of offheap memtable flush (CASSANDRA-12776)
 + * Explicitly set locale for string validation (CASSANDRA-12541,CASSANDRA-12542,CASSANDRA-12543,CASSANDRA-12545)
 +Merged from 2.2:
+  * Prepare legacy authenticate statement if credentials table initialised after node startup (CASSANDRA-12813)
   * Change cassandra.wait_for_tracing_events_timeout_secs default to 0 (CASSANDRA-12754)
   * Clean up permissions when a UDA is dropped (CASSANDRA-12720)
   * Limit colUpdateTimeDelta histogram updates to reasonable deltas (CASSANDRA-7)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4f840aa/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4f840aa/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
--
[07/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e4f840aa
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e4f840aa
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e4f840aa

Branch: refs/heads/trunk
Commit: e4f840aa1f84f008cf7c14bdd6a22aebd1b41c70
Parents: e9ff6ae 312e21b
Author: Sam Tunnicliffe
Authored: Fri Oct 28 16:08:29 2016 +0100
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:08:29 2016 +0100

--
 CHANGES.txt                                   |  1 +
 .../cassandra/auth/CassandraAuthorizer.java   | 14 +--
 .../cassandra/auth/PasswordAuthenticator.java | 40 ++--
 3 files changed, 40 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4f840aa/CHANGES.txt
--
diff --cc CHANGES.txt
index 9910245,b33ef8d..bf1e7d6
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,30 -1,5 +1,31 @@@
 -2.2.9
 +3.0.10
 + * Fix for KeyCacheCqlTest flakiness (CASSANDRA-12801)
 + * Include SSTable filename in compacting large row message (CASSANDRA-12384)
 + * Fix potential socket leak (CASSANDRA-12329, CASSANDRA-12330)
 + * Fix ViewTest.testCompaction (CASSANDRA-12789)
 + * Improve avg aggregate functions (CASSANDRA-12417)
 + * Preserve quoted reserved keyword column names in MV creation (CASSANDRA-11803)
 + * nodetool stopdaemon errors out (CASSANDRA-12646)
 + * Split materialized view mutations on build to prevent OOM (CASSANDRA-12268)
 + * mx4j does not work in 3.0.8 (CASSANDRA-12274)
 + * Abort cqlsh copy-from in case of no answer after prolonged period of time (CASSANDRA-12740)
 + * Avoid sstable corrupt exception due to dropped static column (CASSANDRA-12582)
 + * Make stress use client mode to avoid checking commit log size on startup (CASSANDRA-12478)
 + * Fix exceptions with new vnode allocation (CASSANDRA-12715)
 + * Unify drain and shutdown processes (CASSANDRA-12509)
 + * Fix NPE in ComponentOfSlice.isEQ() (CASSANDRA-12706)
 + * Fix failure in LogTransactionTest (CASSANDRA-12632)
 + * Fix potentially incomplete non-frozen UDT values when querying with the
 +   full primary key specified (CASSANDRA-12605)
 + * Skip writing MV mutations to commitlog on mutation.applyUnsafe() (CASSANDRA-11670)
 + * Establish consistent distinction between non-existing partition and NULL value for LWTs on static columns (CASSANDRA-12060)
 + * Extend ColumnIdentifier.internedInstances key to include the type that generated the byte buffer (CASSANDRA-12516)
 + * Backport CASSANDRA-10756 (race condition in NativeTransportService shutdown) (CASSANDRA-12472)
 + * If CF has no clustering columns, any row cache is full partition cache (CASSANDRA-12499)
 + * Correct log message for statistics of offheap memtable flush (CASSANDRA-12776)
 + * Explicitly set locale for string validation (CASSANDRA-12541,CASSANDRA-12542,CASSANDRA-12543,CASSANDRA-12545)
 +Merged from 2.2:
+  * Prepare legacy authenticate statement if credentials table initialised after node startup (CASSANDRA-12813)
   * Change cassandra.wait_for_tracing_events_timeout_secs default to 0 (CASSANDRA-12754)
   * Clean up permissions when a UDA is dropped (CASSANDRA-12720)
   * Limit colUpdateTimeDelta histogram updates to reasonable deltas (CASSANDRA-7)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4f840aa/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4f840aa/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
--
[02/10] cassandra git commit: Prepare legacy auth statements if tables initialised after node startup
Prepare legacy auth statements if tables initialised after node startup

Patch by Alex Petrov; reviewed by Sam Tunnicliffe for CASSANDRA-12813

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/312e21bd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/312e21bd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/312e21bd

Branch: refs/heads/cassandra-3.0
Commit: 312e21bda7c50f05fc5f8868740b513022385951
Parents: eaf46a1
Author: Alex Petrov
Authored: Fri Oct 21 16:58:33 2016 +0200
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:04:36 2016 +0100

--
 CHANGES.txt                                   |  1 +
 .../cassandra/auth/CassandraAuthorizer.java   | 14 +--
 .../cassandra/auth/PasswordAuthenticator.java | 40 ++--
 3 files changed, 40 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index a22439b..b33ef8d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.9
+ * Prepare legacy authenticate statement if credentials table initialised after node startup (CASSANDRA-12813)
  * Change cassandra.wait_for_tracing_events_timeout_secs default to 0 (CASSANDRA-12754)
  * Clean up permissions when a UDA is dropped (CASSANDRA-12720)
  * Limit colUpdateTimeDelta histogram updates to reasonable deltas (CASSANDRA-7)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
--
diff --git a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
index 88069a2..360d59a 100644
--- a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
+++ b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
@@ -209,11 +209,19 @@ public class CassandraAuthorizer implements IAuthorizer
                 Lists.newArrayList(ByteBufferUtil.bytes(role.getRoleName()),
                                    ByteBufferUtil.bytes(resource.getName(;
 
+        SelectStatement statement;
         // If it exists, read from the legacy user permissions table to handle the case where the cluster
         // is being upgraded and so is running with mixed versions of the authz schema
-        SelectStatement statement = Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null
-                                    ? authorizeRoleStatement
-                                    : legacyAuthorizeRoleStatement;
+        if (Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null)
+            statement = authorizeRoleStatement;
+        else
+        {
+            // If the permissions table was initialised only after the statement got prepared, re-prepare (CASSANDRA-12813)
+            if (legacyAuthorizeRoleStatement == null)
+                legacyAuthorizeRoleStatement = prepare(USERNAME, USER_PERMISSIONS);
+            statement = legacyAuthorizeRoleStatement;
+        }
+
         ResultMessage.Rows rows = statement.execute(QueryState.forInternalCalls(), options) ;
         UntypedResultSet result = UntypedResultSet.create(rows.result);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
--
diff --git a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
index c0d2283..20f8790 100644
--- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
+++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
@@ -77,11 +77,7 @@ public class PasswordAuthenticator implements IAuthenticator
     {
         try
         {
-            // If the legacy users table exists try to verify credentials there. This is to handle the case
-            // where the cluster is being upgraded and so is running with mixed versions of the authn tables
-            SelectStatement authenticationStatement = Schema.instance.getCFMetaData(AuthKeyspace.NAME, LEGACY_CREDENTIALS_TABLE) == null
-                                                      ? authenticateStatement
-                                                      : legacyAuthenticateStatement;
+            SelectStatement authenticationStatement = authenticationStatement();
             return doAuthenticate(username, password, authenticationStatement);
         }
         catch (RequestExecutionException e)
@@
[04/10] cassandra git commit: Prepare legacy auth statements if tables initialised after node startup
Prepare legacy auth statements if tables initialised after node startup

Patch by Alex Petrov; reviewed by Sam Tunnicliffe for CASSANDRA-12813

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/312e21bd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/312e21bd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/312e21bd

Branch: refs/heads/trunk
Commit: 312e21bda7c50f05fc5f8868740b513022385951
Parents: eaf46a1
Author: Alex Petrov
Authored: Fri Oct 21 16:58:33 2016 +0200
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:04:36 2016 +0100

--
 CHANGES.txt                                   |  1 +
 .../cassandra/auth/CassandraAuthorizer.java   | 14 +--
 .../cassandra/auth/PasswordAuthenticator.java | 40 ++--
 3 files changed, 40 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index a22439b..b33ef8d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.9
+ * Prepare legacy authenticate statement if credentials table initialised after node startup (CASSANDRA-12813)
  * Change cassandra.wait_for_tracing_events_timeout_secs default to 0 (CASSANDRA-12754)
  * Clean up permissions when a UDA is dropped (CASSANDRA-12720)
  * Limit colUpdateTimeDelta histogram updates to reasonable deltas (CASSANDRA-7)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
--
diff --git a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
index 88069a2..360d59a 100644
--- a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
+++ b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
@@ -209,11 +209,19 @@ public class CassandraAuthorizer implements IAuthorizer
                 Lists.newArrayList(ByteBufferUtil.bytes(role.getRoleName()),
                                    ByteBufferUtil.bytes(resource.getName(;
 
+        SelectStatement statement;
         // If it exists, read from the legacy user permissions table to handle the case where the cluster
         // is being upgraded and so is running with mixed versions of the authz schema
-        SelectStatement statement = Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null
-                                    ? authorizeRoleStatement
-                                    : legacyAuthorizeRoleStatement;
+        if (Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null)
+            statement = authorizeRoleStatement;
+        else
+        {
+            // If the permissions table was initialised only after the statement got prepared, re-prepare (CASSANDRA-12813)
+            if (legacyAuthorizeRoleStatement == null)
+                legacyAuthorizeRoleStatement = prepare(USERNAME, USER_PERMISSIONS);
+            statement = legacyAuthorizeRoleStatement;
+        }
+
         ResultMessage.Rows rows = statement.execute(QueryState.forInternalCalls(), options) ;
         UntypedResultSet result = UntypedResultSet.create(rows.result);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/312e21bd/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
--
diff --git a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
index c0d2283..20f8790 100644
--- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
+++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
@@ -77,11 +77,7 @@ public class PasswordAuthenticator implements IAuthenticator
     {
         try
         {
-            // If the legacy users table exists try to verify credentials there. This is to handle the case
-            // where the cluster is being upgraded and so is running with mixed versions of the authn tables
-            SelectStatement authenticationStatement = Schema.instance.getCFMetaData(AuthKeyspace.NAME, LEGACY_CREDENTIALS_TABLE) == null
-                                                      ? authenticateStatement
-                                                      : legacyAuthenticateStatement;
+            SelectStatement authenticationStatement = authenticationStatement();
             return doAuthenticate(username, password, authenticationStatement);
         }
         catch (RequestExecutionException e)
@@ -91,6 +
[08/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.X
Merge branch 'cassandra-3.0' into cassandra-3.X

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9be467a2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9be467a2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9be467a2

Branch: refs/heads/cassandra-3.X
Commit: 9be467a22ab646c23710eb2b23dd21e5a46b07bd
Parents: 49ce0a4 e4f840a
Author: Sam Tunnicliffe
Authored: Fri Oct 28 16:11:13 2016 +0100
Committer: Sam Tunnicliffe
Committed: Fri Oct 28 16:17:52 2016 +0100

--
 CHANGES.txt                                   |  1 +
 .../cassandra/auth/CassandraAuthorizer.java   | 13 --
 .../cassandra/auth/PasswordAuthenticator.java | 42 ++--
 3 files changed, 41 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9be467a2/CHANGES.txt
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/9be467a2/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
--
diff --cc src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
index 8c3485d,7ffef27..7f44eef
--- a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
+++ b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
@@@ -215,12 -209,20 +215,19 @@@ public class CassandraAuthorizer implem
                  Lists.newArrayList(ByteBufferUtil.bytes(role.getRoleName()),
                                     ByteBufferUtil.bytes(resource.getName(;
  
+         SelectStatement statement;
          // If it exists, read from the legacy user permissions table to handle the case where the cluster
          // is being upgraded and so is running with mixed versions of the authz schema
-         SelectStatement statement = Schema.instance.getCFMetaData(SchemaConstants.AUTH_KEYSPACE_NAME, USER_PERMISSIONS) == null
-                                     ? authorizeRoleStatement
-                                     : legacyAuthorizeRoleStatement;
 -        if (Schema.instance.getCFMetaData(AuthKeyspace.NAME, USER_PERMISSIONS) == null)
++        if (Schema.instance.getCFMetaData(SchemaConstants.AUTH_KEYSPACE_NAME, USER_PERMISSIONS) == null)
+             statement = authorizeRoleStatement;
+         else
+         {
+             // If the permissions table was initialised only after the statement got prepared, re-prepare (CASSANDRA-12813)
+             if (legacyAuthorizeRoleStatement == null)
+                 legacyAuthorizeRoleStatement = prepare(USERNAME, USER_PERMISSIONS);
+             statement = legacyAuthorizeRoleStatement;
+         }
- 
 -        ResultMessage.Rows rows = statement.execute(QueryState.forInternalCalls(), options) ;
 +        ResultMessage.Rows rows = statement.execute(QueryState.forInternalCalls(), options, System.nanoTime());
          UntypedResultSet result = UntypedResultSet.create(rows.result);
  
          if (!result.isEmpty() && result.one().has(PERMISSIONS))

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9be467a2/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
--
diff --cc src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
index b0317f3,ca610d1..4b667ae
--- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
+++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java
@@@ -86,56 -78,8 +86,52 @@@ public class PasswordAuthenticator impl
      {
          try
          {
 +            String hash = cache.get(username);
 +            if (!BCrypt.checkpw(password, hash))
 +                throw new AuthenticationException(String.format("Provided username %s and/or password are incorrect", username));
 +
 +            return new AuthenticatedUser(username);
 +        }
 +        catch (ExecutionException | UncheckedExecutionException e)
 +        {
 +            // the credentials were somehow invalid - either a non-existent role, or one without a defined password
 +            if (e.getCause() instanceof NoSuchCredentialsException)
 +                throw new AuthenticationException(String.format("Provided username %s and/or password are incorrect", username));
 +
 +            // an unanticipated exception occured whilst querying the credentials table
 +            if (e.getCause() instanceof RequestExecutionException)
 +            {
 +                logger.trace("Error performing internal authentication", e);
 +                throw new AuthenticationException(String.format("Error during authentication of user %s : %s", username, e.getMessage()));
 +            }
 +
 +
[jira] [Created] (CASSANDRA-12859) Column-level permissions
Boris Melamed created CASSANDRA-12859: - Summary: Column-level permissions Key: CASSANDRA-12859 URL: https://issues.apache.org/jira/browse/CASSANDRA-12859 Project: Cassandra Issue Type: New Feature Components: Core, CQL Reporter: Boris Melamed Here is a draft of: Cassandra Proposal - Column-level permissions.docx https://ibm.box.com/s/ithyzt0bhlcfb49dl5x6us0c887p1ovw Quoting the 'Overview' section: The purpose of this proposal is to add column-level (field-level) permissions to Cassandra. It is my intent to soon start implementing this feature in a fork, and to submit a pull request once it’s ready. Motivation Cassandra already supports permissions on keyspace and table (column family) level. Sources: - http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra - https://cassandra.apache.org/doc/latest/cql/security.html#data-control At IBM, we have use cases in the area of big data analytics where column-level access permissions are also a requirement. All industry RDBMS products are supporting this level of permission control, and regulators are expecting it from all data-based systems. Main day-one requirements 1. Extend CQL (Cassandra Query Language) to be able to optionally specify a list of individual columns, in the GRANT statement. The relevant permission types are: MODIFY (for UPDATE and INSERT) and SELECT. 2. Persist the optional information in the appropriate system table ‘system_auth.role_permissions’. 3. Enforce the column access restrictions during execution. Details: a.Should fit with the existing permission propagation down a role chain. b.Proposed message format when a user’s roles give access to the queried table but not to all of the selected, inserted, or updated columns: "User %s has no %s permission on column %s of table %s" c.Error will report only the first checked column. Nice to have: list all inaccessible columns. d.Error code is the same as for table access denial: 2100. Additional day-one requirements 4. 
Reflect the column-level permissions in statements of type LIST ALL PERMISSIONS OF someuser; 5. Performance should not degrade in any significant way. 6. Backwards compatibility a.Permission enforcement for DBs created before the upgrade should continue to work with the same behavior after upgrading to a version that allows column-level permissions. b.Previous CQL syntax will remain valid, and have the same effect as before. 7. Documentation o https://cassandra.apache.org/doc/latest/cql/security.html#grammar-token-permission o Feedback request: any others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
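A rough illustration of requirement 3 above (enforcing column restrictions and reporting the first inaccessible column) might look like the following Python sketch. It is not taken from the proposal document: the `grants` mapping, the `check_column_access` function and the `UnauthorizedError` class are hypothetical names, and the sketch assumes table-level access and the user's role chain have already been resolved elsewhere. Only the message format is quoted from the proposal.

```python
class UnauthorizedError(Exception):
    """Hypothetical stand-in for an access-denied error (not a Cassandra class)."""


# A grant stored without a column list is assumed to cover the whole table.
ALL_COLUMNS = None


def check_column_access(user, roles, permission, table, columns, grants):
    """Raise UnauthorizedError for the first column the user's roles do not cover.

    `grants` is a hypothetical mapping of (role, permission, table) to either
    ALL_COLUMNS or a set of column names. `roles` is the user's already
    resolved role chain.
    """
    allowed = set()
    for role in roles:
        key = (role, permission, table)
        if key not in grants:
            continue
        granted = grants[key]
        if granted is ALL_COLUMNS:
            return  # an unrestricted grant on the table covers every column
        allowed |= set(granted)

    # Report only the first checked column, as in item 3c of the proposal.
    for column in columns:
        if column not in allowed:
            raise UnauthorizedError(
                "User %s has no %s permission on column %s of table %s"
                % (user, permission, column, table)
            )
```

Because an unrestricted grant short-circuits the check, grants written with the pre-existing syntax would keep their current effect in this sketch, which is the behaviour item 6 asks for.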
[jira] [Updated] (CASSANDRA-12838) Extend native protocol flags and add supported versions to the SUPPORTED response
[ https://issues.apache.org/jira/browse/CASSANDRA-12838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Holmberg updated CASSANDRA-12838: -- Labels: client-impacting (was: ) > Extend native protocol flags and add supported versions to the SUPPORTED > response > - > > Key: CASSANDRA-12838 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12838 > Project: Cassandra > Issue Type: Sub-task > Components: CQL >Reporter: Stefania >Assignee: Stefania > Labels: client-impacting > Fix For: 3.x > > > We already use 7 bits for the flags of the QUERY message, and since they are > encoded with a fixed size byte, we may be forced to change the structure of > the message soon, and I'd like to do this in version 5 but without wasting > bytes on the wire. Therefore, I propose to convert fixed flag's bytes to > unsigned vints, as defined in CASSANDRA-9499. The only exception would be the > flags in the frame, which should stay as fixed size. > Up to 7 bits, vints are encoded the same as bytes are, so no immediate change > would be required in the drivers, although they should plan to support vint > flags if supporting version 5. Moving forward, when a new flag is required > for the QUERY message, and eventually when other flags reach 8 bits in other > messages too, the flag's bitmaps would be automatically encoded with a size > that is big enough to accommodate all flags, but no bigger than required. We > can currently support up to 8 bytes with unsigned vints. > The downside is that drivers need to implement unsigned vint encoding for > version 5, but this is already required by CASSANDRA-11873, and will most > likely be required by CASSANDRA-11622 as well. > I would also like to add the list of versions to the SUPPORTED message, in > order to simplify the handshake for drivers that prefer to send an OPTION > message, rather than rely on receiving an error for an unsupported version in > the STARTUP message. 
Said error should also contain the full list of > supported versions, not just the min and max, for clarity, and because the > latest version is now a beta version. > Finally, we currently store versions as integer constants in {{Server.java}}, > and we still have a fair bit of hard-coded numbers in the code, especially in > tests. I plan to clean this up by introducing a {{ProtocolVersion}} enum. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
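For driver authors, the flag encoding described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the unsigned vint layout I understand CASSANDRA-9499 to use: the number of leading 1 bits in the first byte gives the number of extra bytes that follow, and the remaining bits hold the value in big-endian order. The function names are hypothetical and are not part of any driver API.

```python
def encode_unsigned_vint(value):
    """Encode an unsigned 64-bit integer as 1 to 9 bytes (assumed vint layout)."""
    if not 0 <= value < 1 << 64:
        raise ValueError("value must fit in an unsigned 64-bit integer")
    # With n extra bytes (n < 8) the payload is 7 * (n + 1) bits;
    # with 8 extra bytes the payload is the full 64 bits.
    extra = 0
    while extra < 8 and value >= 1 << (7 * (extra + 1)):
        extra += 1
    body = value.to_bytes(extra + 1, "big")
    # The first byte starts with `extra` leading 1 bits.
    prefix = (0xFF << (8 - extra)) & 0xFF
    return bytes([body[0] | prefix]) + body[1:]


def decode_unsigned_vint(data):
    """Return (value, bytes_consumed) for a vint at the start of `data`."""
    first = data[0]
    extra = 0
    while extra < 8 and first & (0x80 >> extra):
        extra += 1
    if len(data) < extra + 1:
        raise ValueError("not enough bytes for this vint")
    # Mask off the length prefix, then append the remaining bytes.
    value = first & (0xFF >> extra)
    for byte in data[1:extra + 1]:
        value = (value << 8) | byte
    return value, extra + 1
```

With this layout a flags bitmap that uses at most 7 bits encodes to the same single byte as today, which matches the ticket's claim that drivers need no immediate change; the first two-byte encoding appears once bit 8 is set.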
[jira] [Created] (CASSANDRA-12858) testall failure in org.apache.cassandra.dht.Murmur3PartitionerTest.testSplitWrapping-compression
Sean McCarthy created CASSANDRA-12858: - Summary: testall failure in org.apache.cassandra.dht.Murmur3PartitionerTest.testSplitWrapping-compression Key: CASSANDRA-12858 URL: https://issues.apache.org/jira/browse/CASSANDRA-12858 Project: Cassandra Issue Type: Bug Reporter: Sean McCarthy example failure: http://cassci.datastax.com/job/cassandra-3.X_testall/49/testReport/org.apache.cassandra.dht/Murmur3PartitionerTest/testSplitWrapping_compression/ {code} Error Message For 8833996864316961974,8833996864316961979: range did not contain new token:8833996864316961974 {code}{code} Stacktrace junit.framework.AssertionFailedError: For 8833996864316961974,8833996864316961979: range did not contain new token:8833996864316961974 at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:138) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:150) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:148) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:150) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:148) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:150) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:150) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:150) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:150) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:148) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:148) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:150) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:150) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:148) at 
org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:148) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:148) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:148) at org.apache.cassandra.dht.PartitionerTestCase.assertSplit(PartitionerTestCase.java:129) at org.apache.cassandra.dht.Murmur3PartitionerTest.testSplitWrapping(Murmur3PartitionerTest.java:50) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8303) Create a capability limitation framework
[ https://issues.apache.org/jira/browse/CASSANDRA-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8303: - Reviewer: (was: Aleksey Yeschenko) > Create a capability limitation framework > > > Key: CASSANDRA-8303 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8303 > Project: Cassandra > Issue Type: Improvement > Components: Distributed Metadata >Reporter: Anupam Arora > Fix For: 3.x > > > In addition to our current Auth framework that acts as a white list, and > regulates access to data, functions, and roles, it would be beneficial to > have a different, capability limitation framework, that would be orthogonal > to Auth, and would act as a blacklist. > Example uses: > - take away the ability to TRUNCATE from all users but the admin (TRUNCATE > itself would still require MODIFY permission) > - take away the ability to use ALLOW FILTERING from all users but > Spark/Hadoop (SELECT would still require SELECT permission) > - take away the ability to use UNLOGGED BATCH from everyone (the operation > itself would still require MODIFY permission) > - take away the ability to use certain consistency levels (make certain > tables LWT-only for all users, for example) > Original description: > Please provide a "strict mode" option in cassandra that will kick out any CQL > queries that are expensive, e.g. any query with ALLOWS FILTERING, > multi-partition queries, secondary index queries, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
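The relationship between the two frameworks described in this ticket can be pictured with a short Python sketch. It only illustrates the idea of a blacklist that is orthogonal to the permission whitelist; the function name, the `granted` and `restricted` mappings, and their shapes are all hypothetical and do not reflect any agreed design.

```python
def is_operation_allowed(user, operation, required_permission, granted, restricted):
    """Combine a permission whitelist with a capability blacklist.

    granted:    hypothetical mapping of user -> set of permissions held.
    restricted: hypothetical mapping of operation -> set of users exempt from
                the restriction. An operation missing from this mapping is
                not limited at all.
    """
    # Blacklist check: the capability is taken away unless the user is exempt.
    if operation in restricted and user not in restricted[operation]:
        return False
    # Whitelist check: the usual permission is still required.
    return required_permission in granted.get(user, set())
```

For example, `restricted = {"TRUNCATE": {"admin"}}` takes TRUNCATE away from everyone but the admin, while the admin still needs MODIFY in `granted` for the call to succeed.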
[jira] [Updated] (CASSANDRA-12728) Handling partially written hint files
[ https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-12728: -- Status: Open (was: Patch Available) > Handling partially written hint files > - > > Key: CASSANDRA-12728 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12728 > Project: Cassandra > Issue Type: Bug >Reporter: Sharvanath Pathak >Assignee: Aleksey Yeschenko > Labels: lhf > Attachments: CASSANDRA-12728.patch > > > {noformat} > ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 > HintsDispatchExecutor.java:225 - Failed to dispatch hints file > d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.EOFException > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) > [apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) > [apache-cassandra-3.0.6.jar:3.0.6] > at > 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) > [apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) > [apache-cassandra-3.0.6.jar:3.0.6] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_77] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_77] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] > Caused by: java.io.EOFException: null > at > org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278) > ~[apache-cassandra-3.0.6.jar:3.0.6] > ... 15 common frames omitted > {noformat} > We've found out that the hint file was truncated because there was a hard > reboot around the time of last write to the file. I think we basically need > to handle partially written hint files. 
Also, the CRC file does not exist in > this case (probably because the node crashed while writing the hints file). Maybe > ignoring and cleaning up such partially written hint files would be a way to > fix this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
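The general idea of tolerating a partially written tail can be sketched as follows in Python. This is not the Cassandra hints file format and not the attached patch: the record layout (4-byte length, payload, 4-byte CRC32) and the function names are assumptions made purely for illustration. The point is that a short read at the end of the file is reported as truncation, while a checksum mismatch on a complete record is still treated as corruption.

```python
import struct
import zlib


def write_record(out, payload):
    """Append one record: big-endian length, payload, CRC32 of the payload."""
    out.write(struct.pack(">I", len(payload)))
    out.write(payload)
    out.write(struct.pack(">I", zlib.crc32(payload)))


def read_records(stream):
    """Return (records, truncated).

    Reading stops quietly at a partially written final record instead of
    raising, so everything written before the crash can still be used.
    """
    records = []
    while True:
        header = stream.read(4)
        if not header:
            return records, False  # clean end of file
        if len(header) < 4:
            return records, True  # crash happened while writing the length
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        trailer = stream.read(4)
        if len(payload) < size or len(trailer) < 4:
            return records, True  # crash happened while writing this record
        if struct.unpack(">I", trailer)[0] != zlib.crc32(payload):
            raise ValueError("checksum mismatch: record is corrupt, not merely truncated")
        records.append(payload)
```

A caller could dispatch the returned records and then delete or archive the file when `truncated` is true, which corresponds to the "ignore and clean up" option raised in the ticket.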
[jira] [Created] (CASSANDRA-12857) Upgrade procedure between 2.1.x and 3.0.x is broken
Alexander Yasnogor created CASSANDRA-12857: -- Summary: Upgrade procedure between 2.1.x and 3.0.x is broken Key: CASSANDRA-12857 URL: https://issues.apache.org/jira/browse/CASSANDRA-12857 Project: Cassandra Issue Type: Bug Reporter: Alexander Yasnogor Priority: Critical It is not possible to safely do an in-place Cassandra upgrade from 2.1.14 to 3.0.9. Distribution: deb packages from the DataStax community repo. The upgrade was performed according to the procedure in this document: https://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/upgrdCassandraDetails.html Potential reason: The upgrade procedure creates a corrupted system_schema keyspace, which then propagates through the cluster and kills it. We started with one datacenter containing 19 nodes divided into two racks. The first rack was upgraded successfully, and nodetool describecluster reported two schema versions: one for upgraded nodes, another for non-upgraded nodes. On starting the new version on the first node from the second rack: {code:java} INFO [main] 2016-10-25 13:06:12,103 LegacySchemaMigrator.java:87 - Moving 11 keyspaces from legacy schema tables to the new schema keyspace (system_schema) INFO [main] 2016-10-25 13:06:12,104 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@7505e6ac INFO [main] 2016-10-25 13:06:12,200 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@64414574 INFO [main] 2016-10-25 13:06:12,204 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@3f2c5f45 INFO [main] 2016-10-25 13:06:12,207 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@2bc2d64d INFO [main] 2016-10-25 13:06:12,301 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@77343846 INFO [main] 2016-10-25 13:06:12,305 LegacySchemaMigrator.java:148 - Migrating 
keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@19b0b931 INFO [main] 2016-10-25 13:06:12,308 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@44bb0b35 INFO [main] 2016-10-25 13:06:12,311 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@79f6cd51 INFO [main] 2016-10-25 13:06:12,319 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@2fcd363b INFO [main] 2016-10-25 13:06:12,356 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@609eead6 INFO [main] 2016-10-25 13:06:12,358 LegacySchemaMigrator.java:148 - Migrating keyspace org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@7eb7f5d0 INFO [main] 2016-10-25 13:06:13,958 LegacySchemaMigrator.java:97 - Truncating legacy schema tables INFO [main] 2016-10-25 13:06:26,474 LegacySchemaMigrator.java:103 - Completed migration of legacy schema tables INFO [main] 2016-10-25 13:06:26,474 StorageService.java:521 - Populating token metadata from system tables INFO [main] 2016-10-25 13:06:26,796 StorageService.java:528 - Token metadata: Normal Tokens: [HUGE LIST of tokens] INFO [main] 2016-10-25 13:06:29,066 ColumnFamilyStore.java:389 - Initializing ... INFO [main] 2016-10-25 13:06:29,066 ColumnFamilyStore.java:389 - Initializing ... 
INFO [main] 2016-10-25 13:06:45,894 AutoSavingCache.java:165 - Completed loading (2 ms; 460 keys) KeyCache cache INFO [main] 2016-10-25 13:06:46,982 StorageService.java:521 - Populating token metadata from system tables INFO [main] 2016-10-25 13:06:47,394 StorageService.java:528 - Token metadata: Normal Tokens:[HUGE LIST of tokens] INFO [main] 2016-10-25 13:06:47,420 LegacyHintsMigrator.java:88 - Migrating legacy hints to new storage INFO [main] 2016-10-25 13:06:47,420 LegacyHintsMigrator.java:91 - Forcing a major compaction of system.hints table INFO [main] 2016-10-25 13:06:50,587 LegacyHintsMigrator.java:95 - Writing legacy hints to the new storage INFO [main] 2016-10-25 13:06:53,927 LegacyHintsMigrator.java:99 - Truncating system.hints table INFO [main] 2016-10-25 13:06:56,572 MigrationManager.java:342 - Create new table: org.apache.cassandra.config.CFMetaData@242e5306[cfId=c5e99f16-8677-3914-b17e-960613512345,ksName=system_traces,cfName=sessions,flags=[COMPOUND],params=TableParams{comment=tracing sessions, read_repair_chance=0.0, dclocal_read_repair_chance=0.0, bloom_filter_fp_chance=0.01, crc_check_chance=1.0, gc_grace_seconds=0, default_time_to_live=0, memtable_flush_period_in_ms=360, min_index_interval=128, max_index_interval=2048, speculative_retry=99PERCENTILE, caching={'keys' : 'ALL', 'rows_per_partition' :
[jira] [Created] (CASSANDRA-12856) dtest failure in replication_test.SnitchConfigurationUpdateTest.test_cannot_restart_with_different_rack
Sean McCarthy created CASSANDRA-12856: - Summary: dtest failure in replication_test.SnitchConfigurationUpdateTest.test_cannot_restart_with_different_rack Key: CASSANDRA-12856 URL: https://issues.apache.org/jira/browse/CASSANDRA-12856 Project: Cassandra Issue Type: Test Reporter: Sean McCarthy Assignee: DS Test Eng Attachments: node1.log example failure: http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/280/testReport/replication_test/SnitchConfigurationUpdateTest/test_cannot_restart_with_different_rack {code} Error Message Problem stopping node node1 {code}{code} Stacktrace File "/usr/lib/python2.7/unittest/case.py", line 329, in run testMethod() File "/home/automaton/cassandra-dtest/replication_test.py", line 630, in test_cannot_restart_with_different_rack node1.stop(wait_other_notice=True) File "/usr/local/lib/python2.7/dist-packages/ccmlib/node.py", line 727, in stop raise NodeError("Problem stopping node %s" % self.name) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-12855) Correct Spelling Errors in IEndPointSnitch JavaDocs
[ https://issues.apache.org/jira/browse/CASSANDRA-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Licata updated CASSANDRA-12855: Comment: was deleted (was: Patch available [here|https://github.com/apache/cassandra/pull/79] ) > Correct Spelling Errors in IEndPointSnitch JavaDocs > --- > > Key: CASSANDRA-12855 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12855 > Project: Cassandra > Issue Type: Task > Components: Distributed Metadata >Reporter: Christopher Licata >Priority: Trivial > Labels: lhf > Fix For: 3.x > > > There are some spelling errors in the JavaDocs for IEndpointSnitch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12855) Correct Spelling Errors in IEndPointSnitch JavaDocs
[ https://issues.apache.org/jira/browse/CASSANDRA-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615403#comment-15615403 ] Christopher Licata commented on CASSANDRA-12855: - Patch available [here|https://github.com/apache/cassandra/pull/79] > Correct Spelling Errors in IEndPointSnitch JavaDocs > --- > > Key: CASSANDRA-12855 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12855 > Project: Cassandra > Issue Type: Task > Components: Distributed Metadata >Reporter: Christopher Licata >Priority: Trivial > Labels: lhf > Fix For: 3.x > > > There are some spelling errors in the JavaDocs for IEndpointSnitch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12855) Correct Spelling Errors in IEndPointSnitch JavaDocs
[ https://issues.apache.org/jira/browse/CASSANDRA-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615384#comment-15615384 ] ASF GitHub Bot commented on CASSANDRA-12855: GitHub user cmlicata opened a pull request: https://github.com/apache/cassandra/pull/79 Correct Spelling Errors in JavaDoc for IEndPointSnitch Addresses issue [CASSANDRA-12855](https://issues.apache.org/jira/browse/CASSANDRA-12855). You can merge this pull request into a Git repository by running: $ git pull https://github.com/cmlicata/cassandra 12855-trunk Alternatively you can review and apply these changes as the patch at: https://github.com/apache/cassandra/pull/79.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #79 commit fcc1df484d00ffad19b280f4cef9d07f2dd6b70d Author: Christopher Licata (xle012) Date: 2016-10-28T13:02:23Z correct JavaDoc in IEndPointSnitch > Correct Spelling Errors in IEndPointSnitch JavaDocs > --- > > Key: CASSANDRA-12855 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12855 > Project: Cassandra > Issue Type: Task > Components: Distributed Metadata >Reporter: Christopher Licata >Priority: Trivial > Labels: lhf > Fix For: 3.x > > > There are some spelling errors in the JavaDocs for IEndpointSnitch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12855) Correct Spelling Errors in IEndPointSnitch JavaDocs
Christopher Licata created CASSANDRA-12855: --- Summary: Correct Spelling Errors in IEndPointSnitch JavaDocs Key: CASSANDRA-12855 URL: https://issues.apache.org/jira/browse/CASSANDRA-12855 Project: Cassandra Issue Type: Task Components: Distributed Metadata Reporter: Christopher Licata Priority: Trivial Fix For: 3.x There are some spelling errors in the JavaDocs for IEndpointSnitch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12539) Empty CommitLog prevents restart
[ https://issues.apache.org/jira/browse/CASSANDRA-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615311#comment-15615311 ] Benjamin Lerer commented on CASSANDRA-12539: What happened is that the node crashed while Cassandra was trying to get a new file pointer to resize a new commit log segment. Nevertheless, it is the expected behaviour that Cassandra should not start if some commitlog segments are empty (there are some tests that check that behaviour). An empty commitlog is a sign of a problem, but Cassandra has no way to determine the root cause; that is why it refuses to start and notifies the administrator. > Empty CommitLog prevents restart > > > Key: CASSANDRA-12539 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12539 > Project: Cassandra > Issue Type: Bug >Reporter: Stefano Ortolani > > A node just crashed (known cause: CASSANDRA-11594) but to my surprise (unlike > other times) restarting simply fails. > Checking the logs showed: > {noformat} > ERROR [main] 2016-08-25 17:05:22,611 JVMStabilityInspector.java:82 - Exiting > due to error while processing commit log during initialization. 
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: > Could not read commit log descriptor in file > /data/cassandra/commitlog/CommitLog-6-1468235564433.log > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:650) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:327) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:148) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:289) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) > [apache-cassandra-3.0.8.jar:3.0.8] > INFO [main] 2016-08-25 17:08:56,944 YamlConfigurationLoader.java:85 - > Configuration location: file:/etc/cassandra/cassandra.yaml > {noformat} > Deleting the empty file fixes the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
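Before deleting anything by hand, an administrator may want to see which segments are actually empty. The Python sketch below only lists zero-length files and removes nothing, since the comment on this ticket stresses that an empty segment is a sign of a problem worth investigating. The function name is hypothetical, and the `CommitLog-*.log` pattern is an assumption inferred from the file name in the log above.

```python
from pathlib import Path


def find_empty_segments(commitlog_dir):
    """Return the zero-length commit log segment files in a directory, sorted by path.

    The file name pattern is an assumption based on names such as
    CommitLog-6-1468235564433.log; nothing is modified or deleted.
    """
    return sorted(
        path
        for path in Path(commitlog_dir).glob("CommitLog-*.log")
        if path.is_file() and path.stat().st_size == 0
    )
```

In the report above the directory would be `/data/cassandra/commitlog`; the returned paths can then be inspected or moved aside once the cause of the crash is understood.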
[jira] [Updated] (CASSANDRA-12854) CommitLogTest.testDeleteIfNotDirty failed in 3.X
[ https://issues.apache.org/jira/browse/CASSANDRA-12854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-12854: - Description: Example failure: http://cassci.datastax.com/view/cassandra-3.X/job/cassandra-3.X_testall/31/testReport/junit/org.apache.cassandra.db.commitlog/CommitLogTest/testDeleteIfNotDirty_3__compression/ {code} expected:<1> but was:<2> Stacktrace junit.framework.AssertionFailedError: expected:<1> but was:<2> at org.apache.cassandra.db.commitlog.CommitLogTest.testDeleteIfNotDirty(CommitLogTest.java:305) {code} was: Example failure: http://cassci.datastax.com/view/cassandra-3.X/job/cassandra-3.X_testall/31/testReport/junit/org.apache.cassandra.db.commitlog/CommitLogTest/testDeleteIfNotDirty_3__compression/ > CommitLogTest.testDeleteIfNotDirty failed in 3.X > > > Key: CASSANDRA-12854 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12854 > Project: Cassandra > Issue Type: Bug >Reporter: Stefania >Assignee: Stefania > Fix For: 3.x > > > Example failure: > http://cassci.datastax.com/view/cassandra-3.X/job/cassandra-3.X_testall/31/testReport/junit/org.apache.cassandra.db.commitlog/CommitLogTest/testDeleteIfNotDirty_3__compression/ > {code} > expected:<1> but was:<2> > Stacktrace > junit.framework.AssertionFailedError: expected:<1> but was:<2> > at > org.apache.cassandra.db.commitlog.CommitLogTest.testDeleteIfNotDirty(CommitLogTest.java:305) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12854) CommitLogTest.testDeleteIfNotDirty failed in 3.X
Stefania created CASSANDRA-12854: Summary: CommitLogTest.testDeleteIfNotDirty failed in 3.X Key: CASSANDRA-12854 URL: https://issues.apache.org/jira/browse/CASSANDRA-12854 Project: Cassandra Issue Type: Bug Reporter: Stefania Assignee: Stefania Fix For: 3.x Example failure: http://cassci.datastax.com/view/cassandra-3.X/job/cassandra-3.X_testall/31/testReport/junit/org.apache.cassandra.db.commitlog/CommitLogTest/testDeleteIfNotDirty_3__compression/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12281) Gossip blocks on startup when another node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-12281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-12281: --- Status: Awaiting Feedback (was: Open) > Gossip blocks on startup when another node is bootstrapping > --- > > Key: CASSANDRA-12281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12281 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Eric Evans >Assignee: Stefan Podkowinski > Attachments: restbase1015-a_jstack.txt > > > In our cluster, normal node startup times (after a drain on shutdown) are > less than 1 minute. However, when another node in the cluster is > bootstrapping, the same node startup takes nearly 30 minutes to complete, the > apparent result of gossip blocking on pending range calculations. > {noformat} > $ nodetool-a tpstats > Pool NameActive Pending Completed Blocked All > time blocked > MutationStage 0 0 1840 0 > 0 > ReadStage 0 0 2350 0 > 0 > RequestResponseStage 0 0 53 0 > 0 > ReadRepairStage 0 0 1 0 > 0 > CounterMutationStage 0 0 0 0 > 0 > HintedHandoff 0 0 44 0 > 0 > MiscStage 0 0 0 0 > 0 > CompactionExecutor3 3395 0 > 0 > MemtableReclaimMemory 0 0 30 0 > 0 > PendingRangeCalculator1 2 29 0 > 0 > GossipStage 1 5602164 0 > 0 > MigrationStage0 0 0 0 > 0 > MemtablePostFlush 0 0111 0 > 0 > ValidationExecutor0 0 0 0 > 0 > Sampler 0 0 0 0 > 0 > MemtableFlushWriter 0 0 30 0 > 0 > InternalResponseStage 0 0 0 0 > 0 > AntiEntropyStage 0 0 0 0 > 0 > CacheCleanupExecutor 0 0 0 0 > 0 > Message type Dropped > READ 0 > RANGE_SLICE 0 > _TRACE 0 > MUTATION 0 > COUNTER_MUTATION 0 > REQUEST_RESPONSE 0 > PAGED_RANGE 0 > READ_REPAIR 0 > {noformat} > A full thread dump is attached, but the relevant bit seems to be here: > {noformat} > [ ... 
] > "GossipStage:1" #1801 daemon prio=5 os_prio=0 tid=0x7fe4cd54b000 > nid=0xea9 waiting on condition [0x7fddcf883000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0004c1e922c0> (a > java.util.concurrent.locks.ReentrantReadWriteLock$FairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:174) > at > org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:160) > at > org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2023) > at > org.apache.cassandra.service.StorageService.onChange(StorageService.java:1682) > at > org.apache.cassandra.gms.Gossiper.doOnChangeNotifications(Gossiper.java:1182) > at org.apache.cassandra.gms.Gossiper.applyNewStates(Gossiper.java:1165) > at > org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1128) > at > org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:58) > a
[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614759#comment-15614759 ] Romain Hardouin commented on CASSANDRA-12649: - Yes, these metrics are useful. Unfortunately, the patch no longer merges into either the cassandra-3.X branch or trunk. > Add BATCH metrics > - > > Key: CASSANDRA-12649 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12649 > Project: Cassandra > Issue Type: Wish >Reporter: Alwyn Davis >Priority: Minor > Fix For: 3.x > > Attachments: trunk-12649.txt > > > To identify causes of load on a cluster, it would be useful to have some > additional metrics: > * *Mutation size distribution:* I believe this would be relevant when > tracking the performance of unlogged batches. > * *Logged / Unlogged Partitions per batch distribution:* This would also give > a count of batch types processed. Multiple distinct tables in batch would > just be considered as separate partitions.
[jira] [Commented] (CASSANDRA-12838) Extend native protocol flags and add supported versions to the SUPPORTED response
[ https://issues.apache.org/jira/browse/CASSANDRA-12838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614737#comment-15614737 ] Benjamin Lerer commented on CASSANDRA-12838: +1 Thanks for the patch. > Extend native protocol flags and add supported versions to the SUPPORTED > response > - > > Key: CASSANDRA-12838 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12838 > Project: Cassandra > Issue Type: Sub-task > Components: CQL >Reporter: Stefania >Assignee: Stefania > Fix For: 3.x > > > We already use 7 bits for the flags of the QUERY message, and since they are > encoded with a fixed size byte, we may be forced to change the structure of > the message soon, and I'd like to do this in version 5 but without wasting > bytes on the wire. Therefore, I propose to convert the fixed-size flag bytes to > unsigned vints, as defined in CASSANDRA-9499. The only exception would be the > flags in the frame, which should stay as fixed size. > Up to 7 bits, vints are encoded the same as bytes are, so no immediate change > would be required in the drivers, although they should plan to support vint > flags if supporting version 5. Moving forward, when a new flag is required > for the QUERY message, and eventually when other flags reach 8 bits in other > messages too, the flag bitmaps would be automatically encoded with a size > that is big enough to accommodate all flags, but no bigger than required. We > can currently support up to 8 bytes with unsigned vints. > The downside is that drivers need to implement unsigned vint encoding for > version 5, but this is already required by CASSANDRA-11873, and will most > likely be required by CASSANDRA-11622 as well. > I would also like to add the list of versions to the SUPPORTED message, in > order to simplify the handshake for drivers that prefer to send an OPTIONS > message, rather than rely on receiving an error for an unsupported version in > the STARTUP message. 
Said error should also contain the full list of > supported versions, not just the min and max, for clarity, and because the > latest version is now a beta version. > Finally, we currently store versions as integer constants in {{Server.java}}, > and we still have a fair bit of hard-coded numbers in the code, especially in > tests. I plan to clean this up by introducing a {{ProtocolVersion}} enum.
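To make the vint proposal above concrete, here is a minimal sketch of an unsigned vint encoder and decoder. It assumes a prefix-length layout in which the number of leading 1 bits in the first byte gives the number of bytes that follow, with the value stored big-endian; that layout is my reading of the CASSANDRA-9499 scheme and is not spelled out in this message. The class name, the method names and the eighth flag bit are hypothetical and exist only for illustration.

```java
// Sketch only: VIntFlagsSketch, encode, decode and the eighth flag are hypothetical names.
class VIntFlagsSketch {

    // Bytes needed for the value: 7 value bits fit in one byte, and every
    // extra byte adds 7 more, up to 9 bytes for a full 64-bit value.
    static int size(long value) {
        int bits = 64 - Long.numberOfLeadingZeros(value | 1);
        return bits > 56 ? 9 : (bits + 6) / 7;
    }

    // Writes the value big-endian, then marks the first byte with one
    // leading 1 bit per extra byte (assumed layout, see the note above).
    static byte[] encode(long value) {
        int size = size(value);
        int extra = size - 1;
        byte[] out = new byte[size];
        for (int i = extra; i >= 0; i--) {
            out[i] = (byte) value;
            value >>>= 8;
        }
        out[0] |= (byte) ~(0xff >> extra);
        return out;
    }

    // Counts the leading 1 bits of the first byte to learn how many bytes
    // follow, then rebuilds the value from the remaining bits.
    static long decode(byte[] in) {
        int first = in[0] & 0xff;
        int extra = Integer.numberOfLeadingZeros(~first & 0xff) - 24;
        long value = first & (0xff >> extra);
        for (int i = 1; i <= extra; i++) {
            value = (value << 8) | (in[i] & 0xff);
        }
        return value;
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        long sevenFlags = 0x7FL;              // the seven bits already in use
        long eightFlags = sevenFlags | 0x80L; // plus a hypothetical eighth flag
        System.out.println(hex(encode(sevenFlags)));
        System.out.println(hex(encode(eightFlags)));
        System.out.println(decode(encode(eightFlags)) == eightFlags);
    }
}
```

Under these assumptions, a flag set that fits in 7 bits is written as the same single byte a fixed-size encoding would produce, which is why drivers would see no immediate change; the encoding only grows once an eighth bit is set.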
[jira] [Updated] (CASSANDRA-12838) Extend native protocol flags and add supported versions to the SUPPORTED response
[ https://issues.apache.org/jira/browse/CASSANDRA-12838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-12838: --- Status: Ready to Commit (was: Patch Available) > Extend native protocol flags and add supported versions to the SUPPORTED > response > - > > Key: CASSANDRA-12838 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12838 > Project: Cassandra > Issue Type: Sub-task > Components: CQL >Reporter: Stefania >Assignee: Stefania > Fix For: 3.x > > > We already use 7 bits for the flags of the QUERY message, and since they are > encoded with a fixed size byte, we may be forced to change the structure of > the message soon, and I'd like to do this in version 5 but without wasting > bytes on the wire. Therefore, I propose to convert the fixed-size flag bytes to > unsigned vints, as defined in CASSANDRA-9499. The only exception would be the > flags in the frame, which should stay as fixed size. > Up to 7 bits, vints are encoded the same as bytes are, so no immediate change > would be required in the drivers, although they should plan to support vint > flags if supporting version 5. Moving forward, when a new flag is required > for the QUERY message, and eventually when other flags reach 8 bits in other > messages too, the flag bitmaps would be automatically encoded with a size > that is big enough to accommodate all flags, but no bigger than required. We > can currently support up to 8 bytes with unsigned vints. > The downside is that drivers need to implement unsigned vint encoding for > version 5, but this is already required by CASSANDRA-11873, and will most > likely be required by CASSANDRA-11622 as well. > I would also like to add the list of versions to the SUPPORTED message, in > order to simplify the handshake for drivers that prefer to send an OPTIONS > message, rather than rely on receiving an error for an unsupported version in > the STARTUP message. 
Said error should also contain the full list of > supported versions, not just the min and max, for clarity, and because the > latest version is now a beta version. > Finally, we currently store versions as integer constants in {{Server.java}}, > and we still have a fair bit of hard-coded numbers in the code, especially in > tests. I plan to clean this up by introducing a {{ProtocolVersion}} enum.
[jira] [Commented] (CASSANDRA-12101) DESCRIBE INDEX: missing quotes for case-sensitive index name
[ https://issues.apache.org/jira/browse/CASSANDRA-12101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614707#comment-15614707 ] Sam Tunnicliffe commented on CASSANDRA-12101: - Bah! Sorry, I should've found this rather than opening CASSANDRA-12847. Apologies for the duplication. > DESCRIBE INDEX: missing quotes for case-sensitive index name > > > Key: CASSANDRA-12101 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12101 > Project: Cassandra > Issue Type: Bug >Reporter: Julien >Assignee: Stefania >Priority: Minor > Labels: cqlsh, lhf > > Create a custom index with a case-sensitive name. > The result of the DESCRIBE INDEX command does not have quotes around the > index name. As a result, the index cannot be recreated with this output.
[jira] [Comment Edited] (CASSANDRA-12539) Empty CommitLog prevents restart
[ https://issues.apache.org/jira/browse/CASSANDRA-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614667#comment-15614667 ] Arvind Nithrakashyap edited comment on CASSANDRA-12539 at 10/28/16 7:54 AM: It looks like a crash while writing the logfile can lead to a zero byte file. The following patch, which causes a crash after creating the buffer but before writing the header, produces a commit log full of zeros. This leads to the error described above when it tries to replay the commit log. {noformat} diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java index 0a03c3c..20cddf8 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java @@ -153,6 +153,7 @@ public abstract class CommitLogSegment descriptor = new CommitLogDescriptor(id, commitLog.configuration.getCompressorClass()); logFile = new File(commitLog.location, descriptor.fileName()); +logger.error("location="+descriptor.fileName()); try { channel = FileChannel.open(logFile.toPath(), StandardOpenOption.WRITE, StandardOpenOption.READ, StandardOpenOption.CREATE); @@ -164,6 +165,9 @@ public abstract class CommitLogSegment } buffer = createBuffer(commitLog); +if (true) { + throw new IllegalArgumentException("Here!"); +} // write the header CommitLogDescriptor.writeHeader(buffer, descriptor); endOfBuffer = buffer.capacity(); {noformat} was (Author: anithrak): It looks like a crash while writing the logfile can lead to a zero byte file. 
The following patch which causes a crash at a specific point reliably produces a commit log full of zeros {noformat} diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java index 0a03c3c..20cddf8 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java @@ -153,6 +153,7 @@ public abstract class CommitLogSegment descriptor = new CommitLogDescriptor(id, commitLog.configuration.getCompressorClass()); logFile = new File(commitLog.location, descriptor.fileName()); +logger.error("location="+descriptor.fileName()); try { channel = FileChannel.open(logFile.toPath(), StandardOpenOption.WRITE, StandardOpenOption.READ, StandardOpenOption.CREATE); @@ -164,6 +165,9 @@ public abstract class CommitLogSegment } buffer = createBuffer(commitLog); +if (true) { + throw new IllegalArgumentException("Here!"); +} // write the header CommitLogDescriptor.writeHeader(buffer, descriptor); endOfBuffer = buffer.capacity(); {noformat} > Empty CommitLog prevents restart > > > Key: CASSANDRA-12539 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12539 > Project: Cassandra > Issue Type: Bug >Reporter: Stefano Ortolani > > A node just crashed (known cause: CASSANDRA-11594) but to my surprise (unlike > other time) restarting simply fails. > Checking the logs showed: > {noformat} > ERROR [main] 2016-08-25 17:05:22,611 JVMStabilityInspector.java:82 - Exiting > due to error while processing commit log during initialization. 
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: > Could not read commit log descriptor in file > /data/cassandra/commitlog/CommitLog-6-1468235564433.log > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:650) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:327) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:148) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:289) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) > [apache-cassandra-3.0.8.jar:3.0.8] > INFO [main] 2016-08-25 17:08:56,944 YamlConfig
[jira] [Commented] (CASSANDRA-12539) Empty CommitLog prevents restart
[ https://issues.apache.org/jira/browse/CASSANDRA-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614667#comment-15614667 ] Arvind Nithrakashyap commented on CASSANDRA-12539: -- It looks like a crash while writing the logfile can lead to a zero byte file. The following patch which causes a crash at a specific point reliably produces a commit log full of zeros {noformat} diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java index 0a03c3c..20cddf8 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java @@ -153,6 +153,7 @@ public abstract class CommitLogSegment descriptor = new CommitLogDescriptor(id, commitLog.configuration.getCompressorClass()); logFile = new File(commitLog.location, descriptor.fileName()); +logger.error("location="+descriptor.fileName()); try { channel = FileChannel.open(logFile.toPath(), StandardOpenOption.WRITE, StandardOpenOption.READ, StandardOpenOption.CREATE); @@ -164,6 +165,9 @@ public abstract class CommitLogSegment } buffer = createBuffer(commitLog); +if (true) { + throw new IllegalArgumentException("Here!"); +} // write the header CommitLogDescriptor.writeHeader(buffer, descriptor); endOfBuffer = buffer.capacity(); {noformat} > Empty CommitLog prevents restart > > > Key: CASSANDRA-12539 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12539 > Project: Cassandra > Issue Type: Bug >Reporter: Stefano Ortolani > > A node just crashed (known cause: CASSANDRA-11594) but to my surprise (unlike > other time) restarting simply fails. > Checking the logs showed: > {noformat} > ERROR [main] 2016-08-25 17:05:22,611 JVMStabilityInspector.java:82 - Exiting > due to error while processing commit log during initialization. 
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: > Could not read commit log descriptor in file > /data/cassandra/commitlog/CommitLog-6-1468235564433.log > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:650) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:327) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:148) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:289) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) > [apache-cassandra-3.0.8.jar:3.0.8] > INFO [main] 2016-08-25 17:08:56,944 YamlConfigurationLoader.java:85 - > Configuration location: file:/etc/cassandra/cassandra.yaml > {noformat} > Deleting the empty file fixes the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12539) Empty CommitLog prevents restart
[ https://issues.apache.org/jira/browse/CASSANDRA-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614670#comment-15614670 ] Arvind Nithrakashyap commented on CASSANDRA-12539: -- The following patch seems to fix the issue {noformat} diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java index af8efb4..8f11d13 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java @@ -226,7 +226,7 @@ public class CommitLogReplayer { if (end != 0 || filecrc != 0) { -handleReplayError(false, +handleReplayError(tolerateTruncation, "Encountered bad header at position %d of commit log %s, with invalid CRC. " + "The end of segment marker should be zero.", offset, reader.getPath()); @@ -345,6 +345,14 @@ public class CommitLogReplayer return; } + +int currentFilePointer = (int) reader.getFilePointer(); +if (readSyncMarker(desc, currentFilePointer, reader, true) < 0) { +logger.info("Skipping empty logfile {}", file.getName()); +return; +} +reader.seek(currentFilePointer); + final long segmentId = desc.id; try { {noformat} > Empty CommitLog prevents restart > > > Key: CASSANDRA-12539 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12539 > Project: Cassandra > Issue Type: Bug >Reporter: Stefano Ortolani > > A node just crashed (known cause: CASSANDRA-11594) but to my surprise (unlike > other time) restarting simply fails. > Checking the logs showed: > {noformat} > ERROR [main] 2016-08-25 17:05:22,611 JVMStabilityInspector.java:82 - Exiting > due to error while processing commit log during initialization. 
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: > Could not read commit log descriptor in file > /data/cassandra/commitlog/CommitLog-6-1468235564433.log > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:650) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:327) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:148) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:289) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557) > [apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) > [apache-cassandra-3.0.8.jar:3.0.8] > INFO [main] 2016-08-25 17:08:56,944 YamlConfigurationLoader.java:85 - > Configuration location: file:/etc/cassandra/cassandra.yaml > {noformat} > Deleting the empty file fixes the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
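For anyone who wants to confirm that a segment is in the state described in this ticket before deleting it by hand, the following is a minimal stand-alone sketch that reports whether a file is zero-length or contains only zero bytes. It is not the patch above and does not use any Cassandra classes; the class and method names are hypothetical, and treating "all zeros" as equivalent to "safe to skip" is an assumption based on the comments in this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: BlankSegmentCheck and its methods are hypothetical names.
class BlankSegmentCheck {

    // True if the first `length` bytes of the buffer are all zero.
    static boolean isAllZero(byte[] data, int length) {
        for (int i = 0; i < length; i++) {
            if (data[i] != 0) {
                return false;
            }
        }
        return true;
    }

    // True if the file is empty or every byte in it is zero.
    static boolean isBlank(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                if (!isAllZero(buf, n)) {
                    return false;
                }
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a pre-allocated segment whose header was never written.
        Path segment = Files.createTempFile("CommitLog-sketch-", ".log");
        try {
            Files.write(segment, new byte[1024]);
            System.out.println("blank segment? " + isBlank(segment));
        } finally {
            Files.deleteIfExists(segment);
        }
    }
}
```

The check reads the whole file rather than only the header region, so a segment that has a zeroed header but real data further in is not reported as blank.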