[jira] [Commented] (CASSANDRA-8568) Impose new API on data tracker modifications that makes correct usage obvious and imposes safety
[ https://issues.apache.org/jira/browse/CASSANDRA-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381637#comment-14381637 ]

Marcus Eriksson commented on CASSANDRA-8568:

[~benedict] could you rebase? And there seem to be some merge problems in build.xml

Impose new API on data tracker modifications that makes correct usage obvious and imposes safety

Key: CASSANDRA-8568
URL: https://issues.apache.org/jira/browse/CASSANDRA-8568
Project: Cassandra
Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
Fix For: 3.0

DataTracker has become a bit of a quagmire, and is not at all obvious to interface with, with many subtly different modifiers. I suspect it is still subtly broken, especially around error recovery. I propose piggy-backing on CASSANDRA-7705 to offer RAII (and GC-enforced, for those situations where a try/finally block isn't possible) objects that have transactional behaviour, with a few simple declarative methods that can be composed to provide all of the functionality we currently need. See CASSANDRA-8399 for context.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9023) 2.0.13 write timeouts on driver
[ https://issues.apache.org/jira/browse/CASSANDRA-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381660#comment-14381660 ]

anishek commented on CASSANDRA-9023:

I have retested it and I can recreate it with the same configuration as above. There are no exceptions in the log files, though. One thing I noticed: if I change memtable_flush_writers to 2, the error does not occur for the run.

2.0.13 write timeouts on driver

Key: CASSANDRA-9023
URL: https://issues.apache.org/jira/browse/CASSANDRA-9023
Project: Cassandra
Issue Type: Bug
Environment: For testing, using only a single node. Hardware configuration as follows:
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU MHz: 2000.174
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0-15
OS: Linux version 2.6.32-504.8.1.el6.x86_64 (mockbu...@c6b9.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC))
Disk: a single disk in RAID; total space is about 500 GB, of which 5 GB is used
Reporter: anishek
Attachments: out_system.log

Initially asked at http://www.mail-archive.com/user@cassandra.apache.org/msg41621.html and was suggested to post here. If any more details are required please let me know.
[jira] [Commented] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used
[ https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381674#comment-14381674 ]

Sylvain Lebresne commented on CASSANDRA-8180:

bq. I found it much easier to understand

Glad that it's the case.

bq. I think it might make sense if I implement this change directly on a branch based on {{8099_engine_refactor}}

I wouldn't be the one to blame you for that.

bq. I cannot find a way to implement this unless we iterate twice, the first time to count until the limit has been reached in {{SinglePartitionSliceCommand}} and the second time to return the data

You actually don't have to care about the limit (in SinglePartitionSliceCommand at least). The way to do this would be to return an iterator that first queries and returns the results of the first sstable, and once it has returned all results, transparently queries the 2nd sstable and starts returning those results, etc.

That being said, I do suspect doing this at the merging level (in MergeIterator) would be better. The idea would be to specialize the merge iterator to take specific iterators that expose some {{lowerBound()}} method. That method would be allowed to return a value that is not returned by the iterator but is lower than anything it will return. The merge iterator would use those lower bounds as initial {{Candidate}}s for the iterators, but know that when it consumes those candidates it should just discard them (and get the actual next value of the iterator). Basically, we'd add a way for the iterator to say "don't bother using me until you've at least reached value X". The sstable iterators would typically implement that {{lowerBound}} method by returning the sstable min column name. Provided we make sure the sstable iterators don't do any work unless their {{hasNext/next}} methods are called, we wouldn't actually use a sstable until we've reached its min column name.
Doing it that way would have 2 advantages over doing it at the collation level:
# it is more general, as it would work even if the sstable min/max column names intersect (it's harder/uglier to do the same at the collation level imo)
# it would work for range queries too

We may want to build that on top of CASSANDRA-8915 however.

Optimize disk seek using min/max column name meta data when the LIMIT clause is used

Key: CASSANDRA-8180
URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: Cassandra 2.0.10
Reporter: DOAN DuyHai
Assignee: Stefania
Priority: Minor
Fix For: 3.0

I was working on an example of sensor data table (timeseries) and faced a use case where C* does not optimize reads on disk.

{code}
cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) WITH CLUSTERING ORDER BY (col DESC);
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 10, '10');
... nodetool flush test test ...
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 20, '20');
... nodetool flush test test ...
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 30, '30');
... nodetool flush test test
{code}

After that, I activate request tracing:

{code}
cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;

 activity | timestamp | source | source_elapsed
---------------------------------------------+--------------+-----------+----------------
 execute_cql3_query | 23:48:46,498 | 127.0.0.1 | 0
 Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 23:48:46,498 | 127.0.0.1 | 74
 Preparing statement | 23:48:46,499 | 127.0.0.1 | 253
 Executing single-partition query on test | 23:48:46,499 | 127.0.0.1 | 930
 Acquiring sstable references | 23:48:46,499 | 127.0.0.1 | 943
 Merging memtable tombstones | 23:48:46,499 | 127.0.0.1 | 1032
 Key cache hit for sstable 3 | 23:48:46,500 | 127.0.0.1 | 1160
 Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1173
 Key cache hit for sstable 2 | 23:48:46,500 | 127.0.0.1 | 1889
 Seeking to partition beginning in data file | 23:48:46,500 |
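The lower-bound idea in this comment can be sketched in isolation. Everything below (the Source class, the int values standing in for clustering columns, the merge method) is illustrative rather than Cassandra's actual MergeIterator code: the heap is seeded with each source's cheap lowerBound(), and a source is only "opened" (i.e. a disk seek happens) when its lower bound is actually reached by the merge.

```java
import java.util.*;
import java.util.function.Supplier;

// Illustrative sketch (not Cassandra's MergeIterator): seed the merge heap
// with each source's cheap lowerBound(), and only open a source -- i.e. seek
// into an sstable -- when its lower bound is actually reached.
class LowerBoundMerge {
    static class Source {
        final int lowerBound;                    // e.g. the sstable min column name
        final Supplier<Iterator<Integer>> open;  // the expensive part: a disk seek
        Iterator<Integer> it;
        boolean opened;
        Source(int lowerBound, Supplier<Iterator<Integer>> open) {
            this.lowerBound = lowerBound;
            this.open = open;
        }
    }

    // Return up to `limit` smallest values across all (sorted) sources.
    static List<Integer> merge(List<Source> sources, int limit) {
        // heap entry: { value, source index, 1 if this is a lower-bound placeholder }
        PriorityQueue<int[]> heap = new PriorityQueue<>(Comparator.comparingInt(e -> e[0]));
        for (int i = 0; i < sources.size(); i++)
            heap.add(new int[]{ sources.get(i).lowerBound, i, 1 });

        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty() && out.size() < limit) {
            int[] e = heap.poll();
            Source s = sources.get(e[1]);
            if (e[2] == 1) {
                // Placeholder consumed: discard it and open the real iterator now.
                s.it = s.open.get();
                s.opened = true;
            } else {
                out.add(e[0]);
            }
            if (s.it.hasNext())
                heap.add(new int[]{ s.it.next(), e[1], 0 });
        }
        return out;
    }
}
```

With three "sstables" whose lower bounds are 10, 20 and 30, a LIMIT 1 merge only ever opens the first one, which is exactly the skipping behaviour the comment describes.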
[jira] [Assigned] (CASSANDRA-9036) disk full when running cleanup (on a far from full disk)
[ https://issues.apache.org/jira/browse/CASSANDRA-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson reassigned CASSANDRA-9036:

Assignee: Robert Stupp (was: Marcus Eriksson)

[~snazy] could you have a look? I think it could be related to CASSANDRA-7386

disk full when running cleanup (on a far from full disk)

Key: CASSANDRA-9036
URL: https://issues.apache.org/jira/browse/CASSANDRA-9036
Project: Cassandra
Issue Type: Bug
Reporter: Erik Forsberg
Assignee: Robert Stupp

I'm trying to run cleanup, but get this:

{noformat}
INFO [CompactionExecutor:18] 2015-03-25 10:29:16,355 CompactionManager.java (line 564) Cleaning up SSTableReader(path='/cassandra/production/Data_daily/production-Data_daily-jb-4345750-Data.db')
ERROR [CompactionExecutor:18] 2015-03-25 10:29:16,664 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:18,1,main]
java.io.IOException: disk full
    at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:567)
    at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:63)
    at org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:281)
    at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:225)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

Now that's odd, since:
* Disk has some 680G left
* The sstable it's trying to clean up is far less than 680G:

{noformat}
# ls -lh *4345750*
-rw-r--r-- 1 cassandra cassandra  64M Mar 21 04:42 production-Data_daily-jb-4345750-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 219G Mar 21 04:42 production-Data_daily-jb-4345750-Data.db
-rw-r--r-- 1 cassandra cassandra 503M Mar 21 04:42 production-Data_daily-jb-4345750-Filter.db
-rw-r--r-- 1 cassandra cassandra  42G Mar 21 04:42 production-Data_daily-jb-4345750-Index.db
-rw-r--r-- 1 cassandra cassandra 5.9K Mar 21 04:42 production-Data_daily-jb-4345750-Statistics.db
-rw-r--r-- 1 cassandra cassandra  81M Mar 21 04:42 production-Data_daily-jb-4345750-Summary.db
-rw-r--r-- 1 cassandra cassandra   79 Mar 21 04:42 production-Data_daily-jb-4345750-TOC.txt
{noformat}

Sure, it's large, but it's not 680G. No other compactions are running on that server. I'm getting this on 12 / 56 servers right now. Could it be some bug in the calculation of the expected size of the new sstable, perhaps?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
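A hypothetical sketch of the kind of pre-flight estimate that could produce a spurious "disk full" here. None of these names are Cassandra's actual code; the point is only that if the estimated output size is computed with a wrong keep-ratio (e.g. taken as 1.0, or inflated), a 219G sstable can be rejected even with 680G free:

```java
// Hypothetical sketch, NOT Cassandra's CompactionManager: a proportional
// expected-size estimate for cleanup, which drops keys the node no longer
// owns, followed by a simple free-space check.
class CleanupEstimate {
    // Output should be roughly onDiskLength * keptKeys / totalKeys.
    static long expectedWriteSize(long onDiskLength, long keptKeys, long totalKeys) {
        return (long) Math.ceil(onDiskLength * (double) keptKeys / totalKeys);
    }

    static boolean hasRoom(long expectedWriteSize, long freeBytes) {
        return expectedWriteSize <= freeBytes;
    }
}
```

With a sane ratio the 219G file easily fits in 680G; an estimate inflated a few times over would fail the check exactly as reported.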
[jira] [Commented] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used
[ https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381624#comment-14381624 ]

Stefania commented on CASSANDRA-8180:

[~slebresne], [~thobbs], [~iamaleksey]: I think it might make sense if I implement this change directly on a branch based on {{8099_engine_refactor}}? First of all I found it *much easier* to understand, and secondly I don't particularly want to rebase or merge later on once 8099 is merged into trunk. Any concerns?

I've been looking at the code on 8099 today, and I cannot find a way to implement this unless we iterate twice: the first time to count until the limit has been reached in {{SinglePartitionSliceCommand}}, and the second time to return the data. Or have I missed something? If not, I think we need to store the data in memory via an {{ArrayBackedPartition}}, is this correct?

Here is a very inefficient and ugly way to do this; may I have some pointers on how to improve it? https://github.com/stef1927/cassandra/commits/8180-8099 Specifically in {{querySSTablesByClustering()}} at line 254 of {{SinglePartitionSliceCommand.java}}.

Optimize disk seek using min/max column name meta data when the LIMIT clause is used

Key: CASSANDRA-8180
URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: Cassandra 2.0.10
Reporter: DOAN DuyHai
Assignee: Stefania
Priority: Minor
Fix For: 3.0

I was working on an example of sensor data table (timeseries) and faced a use case where C* does not optimize reads on disk.

{code}
cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) WITH CLUSTERING ORDER BY (col DESC);
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 10, '10');
... nodetool flush test test ...
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 20, '20');
... nodetool flush test test ...
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 30, '30');
...
nodetool flush test test
{code}

After that, I activate request tracing:

{code}
cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;

 activity | timestamp | source | source_elapsed
---------------------------------------------+--------------+-----------+----------------
 execute_cql3_query | 23:48:46,498 | 127.0.0.1 | 0
 Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 23:48:46,498 | 127.0.0.1 | 74
 Preparing statement | 23:48:46,499 | 127.0.0.1 | 253
 Executing single-partition query on test | 23:48:46,499 | 127.0.0.1 | 930
 Acquiring sstable references | 23:48:46,499 | 127.0.0.1 | 943
 Merging memtable tombstones | 23:48:46,499 | 127.0.0.1 | 1032
 Key cache hit for sstable 3 | 23:48:46,500 | 127.0.0.1 | 1160
 Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1173
 Key cache hit for sstable 2 | 23:48:46,500 | 127.0.0.1 | 1889
 Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1901
 Key cache hit for sstable 1 | 23:48:46,501 | 127.0.0.1 | 2373
 Seeking to partition beginning in data file | 23:48:46,501 | 127.0.0.1 | 2384
 Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 23:48:46,501 | 127.0.0.1 | 2768
 Merging data from memtables and 3 sstables | 23:48:46,501 | 127.0.0.1 | 2784
 Read 2 live and 0 tombstoned cells | 23:48:46,501 | 127.0.0.1 | 2976
 Request complete | 23:48:46,501 | 127.0.0.1 | 3551
{code}

We can clearly see that C* hits 3 SSTables on disk instead of just one, although it has the min/max column meta data to decide which SSTable contains the most recent data. Funnily enough, if we add a clause on the clustering column to the select, this time C* optimizes the read path:

{code}
cqlsh:test> SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1;

 activity
[jira] [Commented] (CASSANDRA-8899) cqlsh - not able to get row count with select(*) for large table
[ https://issues.apache.org/jira/browse/CASSANDRA-8899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381582#comment-14381582 ]

Benjamin Lerer commented on CASSANDRA-8899:

[~jeffl] Could you check the effect of increasing the timeout? By default the page size is 10,000 rows and the count is performed on the coordinator node. This means that if the data is not on your coordinator node, it will have to send at least 5 queries (more if the data is distributed over several nodes) to the other nodes. Depending on how far your nodes are from the coordinator, the latency can add up pretty quickly. The best way for you to verify this theory would be to use request tracing: http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2

cqlsh - not able to get row count with select(*) for large table

Key: CASSANDRA-8899
URL: https://issues.apache.org/jira/browse/CASSANDRA-8899
Project: Cassandra
Issue Type: Bug
Environment: Cassandra 2.1.2, Ubuntu 12.04
Reporter: Jeff Liu
Assignee: Benjamin Lerer

I'm getting errors when running a query that looks at a large number of rows.

{noformat}
cqlsh:events> select count(*) from catalog;

 count
-------
     1

(1 rows)

cqlsh:events> select count(*) from catalog limit 11000;

 count
-------
 11000

(1 rows)

cqlsh:events> select count(*) from catalog limit 5;
errors={}, last_host=127.0.0.1
cqlsh:events>
{noformat}

We are not able to make the select * query to get the row count.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
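The arithmetic behind "at least 5 queries" is a ceiling division over pages: the coordinator keeps fetching 10,000-row pages until the table is exhausted, so count(*) latency scales with row count. A trivial illustrative sketch (class and method names are ours, not the driver's):

```java
// Back-of-envelope sketch of the round trips behind a paged count(*):
// with a 10,000-row page size, counting N rows needs ceil(N / 10000)
// coordinator round trips, each adding inter-node latency.
class PagedCount {
    static long roundTrips(long totalRows, int pageSize) {
        return (totalRows + pageSize - 1) / pageSize; // ceiling division
    }
}
```

So a ~50,000-row count is at least 5 round trips, and each hop to a remote replica multiplies the wall-clock cost, which is why increasing the client timeout (or tracing the request) is the first thing to check.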
[jira] [Updated] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-9033:

Priority: Major (was: Blocker)

Lowering prio, as the actual problem is that you have that many tiny files on your node. The question is how you ended up with that many files. Did you run repairs prior to the number of files exploding? Do you have graphs of how many files you have on the node? Is there a gradual increase over time, or did it happen overnight?

Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive

Key: CASSANDRA-9033
URL: https://issues.apache.org/jira/browse/CASSANDRA-9033
Project: Cassandra
Issue Type: Bug
Environment:
* Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
* EC2 m2-xlarge instances [4 CPU, 16GB RAM, 1TB storage on 3 platters]
* 12 nodes running a mix of 2.1.1 and 2.1.3
* 8GB stack size with offheap objects
Reporter: Brent Haines
Assignee: Marcus Eriksson
Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip

We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB each. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It had been running great, though, until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well.
Our schema:

{code}
cqlsh> describe columnfamily data.stories

CREATE TABLE data.stories (
    id timeuuid PRIMARY KEY,
    action_data timeuuid,
    action_name text,
    app_id timeuuid,
    app_instance_id timeuuid,
    data map<text, text>,
    objects set<timeuuid>,
    time_stamp timestamp,
    user_id timeuuid
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see'
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

cqlsh>
{code}

There were no log entries that stood out. It pretty much consisted of "x is down" / "x is up" repeated ad infinitum. I have attached the zipped system.log that shows the situation after the upgrade, and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data.

On another note, we see a lot of this during repair now (on all the nodes):

{code}
ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error
java.io.IOException: Failed during snapshot creation.
    at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation.
    at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
[jira] [Commented] (CASSANDRA-9028) Optimize LIMIT execution to mitigate need for a full partition scan
[ https://issues.apache.org/jira/browse/CASSANDRA-9028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381688#comment-14381688 ]

Sylvain Lebresne commented on CASSANDRA-9028:

Well, the trace does say that all sstables have been touched, as you said, and they have, but touching a sstable is a world away from reading the entire partition into memory. The reason your first query does touch 2 sstables is that the code does not know which sstable will have results for the query, how much it will have, nor which results will sort first. This is not particularly abnormal; there is only so much the storage engine can deduce without reading any data, but this doesn't change the fact that as little as possible is read from each sstable, and we certainly don't retrieve entire partitions unless we have to. The reason the 2nd request actually only hits a single sstable is that this request is more restricted, and the engine is able to use that additional restriction to eliminate one of the sstables.

For completeness' sake, I'll note that there is actually some optimization we're contemplating in CASSANDRA-8180 to avoid touching sstables in some cases. This might or might not help your first query; I honestly haven't looked closely enough at the example to say. It won't make a terribly huge difference in any case.

Optimize LIMIT execution to mitigate need for a full partition scan

Key: CASSANDRA-9028
URL: https://issues.apache.org/jira/browse/CASSANDRA-9028
Project: Cassandra
Issue Type: Improvement
Components: API, Core
Reporter: jonathan lacefield
Attachments: Data.1.json, Data.2.json, Data.3.json, test.ddl, tracing.out

Currently, a SELECT statement for a single partition key that contains a LIMIT X clause will fetch an entire partition from a node and place the partition into memory prior to applying the limit clause and returning results to be served to the client via the coordinator.
This JIRA is to request an optimization for the CQL LIMIT clause to avoid the entire-partition retrieval step, and instead only retrieve the components needed to satisfy the LIMIT condition. Ideally, any LIMIT X would avoid the need to retrieve a full partition. This may not be possible, though. As a compromise, it would still be incredibly beneficial if a LIMIT 1 clause could be optimized to only retrieve the latest item. Ideally a LIMIT 1 would operationally behave the same way as a clustering-key WHERE clause where the latest (i.e. LIMIT 1) col value was specified. We can supply some trace results to help show the difference between 2 different queries that perform the same logical function, if desired: for example, a query that returns the latest value for a clustering col, where QUERY 1 uses a LIMIT 1 clause and QUERY 2 uses a WHERE clustering col = latest value.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
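The requested behaviour amounts to applying the limit while pulling rows from a lazily-reading iterator, rather than after materialising the whole partition. A generic illustrative sketch (not the storage engine's real code; the names are ours):

```java
import java.util.*;

// Sketch of limit-during-iteration: stop pulling from the row iterator as
// soon as LIMIT rows have been produced, so a lazy source never has to
// surface the rest of the partition.
class LimitedRead {
    static <T> List<T> limited(Iterator<T> rows, int limit) {
        List<T> out = new ArrayList<>();
        while (out.size() < limit && rows.hasNext())
            out.add(rows.next()); // stop as soon as LIMIT is satisfied
        return out;
    }
}
```

If the underlying iterator reads from disk on demand, a LIMIT 1 query pulls exactly one row; whether the engine can be that lazy end to end is the crux of the ticket.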
[jira] [Resolved] (CASSANDRA-9028) Optimize LIMIT execution to mitigate need for a full partition scan
[ https://issues.apache.org/jira/browse/CASSANDRA-9028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-9028.

Resolution: Not a Problem

Optimize LIMIT execution to mitigate need for a full partition scan

Key: CASSANDRA-9028
URL: https://issues.apache.org/jira/browse/CASSANDRA-9028
Project: Cassandra
Issue Type: Improvement
Components: API, Core
Reporter: jonathan lacefield
Attachments: Data.1.json, Data.2.json, Data.3.json, test.ddl, tracing.out

Currently, a SELECT statement for a single partition key that contains a LIMIT X clause will fetch an entire partition from a node and place the partition into memory prior to applying the limit clause and returning results to be served to the client via the coordinator. This JIRA is to request an optimization for the CQL LIMIT clause to avoid the entire-partition retrieval step, and instead only retrieve the components needed to satisfy the LIMIT condition. Ideally, any LIMIT X would avoid the need to retrieve a full partition. This may not be possible, though. As a compromise, it would still be incredibly beneficial if a LIMIT 1 clause could be optimized to only retrieve the latest item. Ideally a LIMIT 1 would operationally behave the same way as a clustering-key WHERE clause where the latest (i.e. LIMIT 1) col value was specified. We can supply some trace results to help show the difference between 2 different queries that perform the same logical function, if desired: for example, a query that returns the latest value for a clustering col, where QUERY 1 uses a LIMIT 1 clause and QUERY 2 uses a WHERE clustering col = latest value.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381722#comment-14381722 ]

Benedict commented on CASSANDRA-8670:

bq. What tool are you using to review?

I like to navigate in IntelliJ, and on the command line, so having a clean run of commits helps a lot.

After a bit of consideration, I think there's a good justification for introducing a whole new class if we intend to fully replace DataStreamOutputAndChannel, largely because the two write paths are not at all clear, and appear to be different (the old versions of the write paths being hard to actually pin down the location of in the VM source). Having a solid handle on how it behaves, and ensuring fewer code paths are executed, seems a good thing. As such, I think this patch should replace DSOaC entirely, and remove it from the codebase. I also think this is a good opportunity to share its code with DataOutputByteBuffer, and in doing so hopefully make that faster, potentially improving performance of CL append (it doesn't need to extend AbstractDataOutput, and would share most of its implementation with NIODataOutputStream if it did not).

A few comments on NIODataInputStream:
* readNext() should assert it is never shuffling more than 7 bytes; in fact, ideally this would be done by readMinimum() to make it clearer
* readNext() should IMO never shuffle unless it's at the end of its capacity; if it hasRemaining() and limit() != capacity(), it should read on from its current limit (readMinimum can ensure there is room to fully meet its requirements)
* readUnsignedShort() could simply be: {{return readShort() & 0xFFFF;}}
* available() should return at least the bytes in the buffer
* ensureMinimum() isn't clearly named, since it is more intrinsically linked to primitive reads than it suggests, consuming the bytes and throwing EOF if it cannot read.
Something like preparePrimitiveRead() (no fixed idea myself, just think it is more than ensureMinimum).

A few comments on NIODataOutputStreamPlus:
* close() should flush
* close() should clean the buffer
* why the use of hollowBuffer? For clarity in case of restoring the cursor position during exceptions? It would be helpful to clarify with a comment. It seems like perhaps this should only be used for the first branch, though, since the second should have no risk of throwing an exception, so we can safely restore the position. It seems like it might be best to make hollowBuffer default to null, and instantiate it only if it is larger than our buffer size, otherwise first flushing our internal buffer if we haven't got enough room. This way we should rarely need the hollowBuffer.
* we should either extend our AbstractDataOutput, or make our writeUTF method public static, so we can share it

Finally, it would be nice if we didn't need to stash the OutputStream version separately. Perhaps we can reorganise the class hierarchy, so that DataOutputStreamPlus doesn't wrap an internal OutputStream; it just is a light abstract class merge of the types OutputStream and DataOutputPlus. We can introduce a WrappedDataOutputStreamPlus in its place, and AbstractDataOutput could extend our new DataOutputStreamPlus instead of the other way around (with Wrapped... extending _it_). Then we can just stash a DataOutputStreamPlus in all cases. Sound reasonable?

Large columns + NIO memory pooling causes excessive direct memory usage

Key: CASSANDRA-8670
URL: https://issues.apache.org/jira/browse/CASSANDRA-8670
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ariel Weisberg
Assignee: Ariel Weisberg
Fix For: 3.0
Attachments: largecolumn_test.py

If you provide a large byte array to NIO and ask it to populate the byte array from a socket, it will allocate a thread-local byte buffer that is the size of the requested read, no matter how large it is.
Old IO wraps new IO for sockets (but not files), so old IO is affected as well. Even if you are using Buffered{Input | Output}Stream you can end up passing a large byte array to NIO: the byte array read method will pass the array to NIO directly if it is larger than the internal buffer. Passing large cells between nodes as part of intra-cluster messaging can cause the NIO pooled buffers to quickly reach a high watermark and stay there. This ends up costing 2x the largest cell size, because there is a buffer each for input and output, since they are different threads. This is further multiplied by the number of nodes in the cluster - 1, since each has a dedicated thread pair with separate thread locals. Anecdotally it appears that the cost is doubled beyond that, although it isn't clear why. Possibly the control
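The BufferedInputStream pass-through described above is observable with plain JDK classes. This illustrative demo (class and method names are ours) records the read sizes that reach the underlying stream: small requests are served through the 8 KB internal buffer, while a request larger than the buffer is handed straight down, which is exactly how a large byte[] would reach an NIO-backed socket stream and trigger the full-size thread-local buffer:

```java
import java.io.*;

// Demonstrates BufferedInputStream's pass-through: once the caller's request
// exceeds the internal buffer, the caller's byte[] goes directly to the
// underlying stream instead of through the buffer.
class PassThroughDemo {
    static class RecordingStream extends FilterInputStream {
        int largestRequest = 0;
        RecordingStream(InputStream in) { super(in); }
        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            largestRequest = Math.max(largestRequest, len); // record request size
            return super.read(b, off, len);
        }
    }

    // Largest read issued against the underlying stream when the caller asks
    // for `arraySize` bytes through an 8 KB BufferedInputStream.
    static int largestRequestFor(int arraySize) {
        try {
            RecordingStream rec = new RecordingStream(new ByteArrayInputStream(new byte[1 << 20]));
            new BufferedInputStream(rec, 8192).read(new byte[arraySize]);
            return rec.largestRequest;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```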
[jira] [Commented] (CASSANDRA-8568) Impose new API on data tracker modifications that makes correct usage obvious and imposes safety
[ https://issues.apache.org/jira/browse/CASSANDRA-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381773#comment-14381773 ]

Benedict commented on CASSANDRA-8568:

Yep, will do. I'm rebasing CASSANDRA-8984 onto trunk, and then will rebase this onto that, since it's likely that will be committed soon(ish).

Impose new API on data tracker modifications that makes correct usage obvious and imposes safety

Key: CASSANDRA-8568
URL: https://issues.apache.org/jira/browse/CASSANDRA-8568
Project: Cassandra
Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
Fix For: 3.0

DataTracker has become a bit of a quagmire, and is not at all obvious to interface with, with many subtly different modifiers. I suspect it is still subtly broken, especially around error recovery. I propose piggy-backing on CASSANDRA-7705 to offer RAII (and GC-enforced, for those situations where a try/finally block isn't possible) objects that have transactional behaviour, with a few simple declarative methods that can be composed to provide all of the functionality we currently need. See CASSANDRA-8399 for context.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381708#comment-14381708 ]

Benedict commented on CASSANDRA-8984:

bq. in a stable release.

Well, our release page doesn't quite agree with this implicit assertion (that 2.1 is stable) - but like I say, we can accept the risk as it stands and just try to patch it up as necessary. I'm more keen to fix these than others since I've taken the heat of the failures, but I'm comfortable so long as I've put my version of the future out there and highlighted my concerns.

[~JoshuaMcKenzie]: I've pushed a small update that I expect fixes the Windows issue (though I'm looking forward to automated branch testing so I can corroborate against Windows directly).

Introduce Transactional API for behaviours that can corrupt system state

Key: CASSANDRA-8984
URL: https://issues.apache.org/jira/browse/CASSANDRA-8984
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Fix For: 2.1.4
Attachments: 8984_windows_timeout.txt

As a penultimate (and probably final for 2.1, if we agree to introduce it there) round of changes to the internals managing sstable writing, I've introduced a new API called Transactional that I hope will make it much easier to write correct behaviour. As things stand we conflate a lot of behaviours into methods like close - the recent changes unpicked some of these, but didn't go far enough. My proposal here introduces an interface designed to support four actions (on top of their normal function):
* prepareToCommit
* commit
* abort
* cleanup

In normal operation, once we have finished constructing a state change we call prepareToCommit; once all such state changes are prepared, we call commit. If at any point anything fails, abort is called. In _either_ case, cleanup is called at the very last.
These transactional objects are all AutoCloseable, with the behaviour being to roll back any changes unless commit has completed successfully. The changes are actually less invasive than they might sound, since we recently introduced abort in some places, and already had commit-like methods. This simply formalises the behaviour and makes it consistent between all objects that interact in this way. Much of the code change is boilerplate, such as moving an object into a try-declaration, although the change is still non-trivial. What it _does_ do is eliminate a _lot_ of special casing that we have had since 2.1 was released. The data tracker API changes and compaction leftover cleanups should finish the job of making this much easier to reason about, but I think this change is worth considering for 2.1, since we've just overhauled this entire area (and not released these changes), and this change is essentially just the finishing touches, so the risk is minimal and the potential gains reasonably significant. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
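The four-action lifecycle described above can be sketched as a minimal interface. This is an illustrative reading of the ticket, not the actual Cassandra code: the names mirror the proposal, but the classes are simplified stand-ins.

```java
// Sketch of the proposed Transactional lifecycle; illustrative only.
public class TransactionalSketch {

    interface Transactional extends AutoCloseable {
        void prepareToCommit();
        void commit();
        void abort();
        void cleanup();
    }

    // close() rolls back unless commit() has completed successfully,
    // and cleanup() runs in either case, as the ticket describes.
    static abstract class AbstractTransactional implements Transactional {
        private boolean committed = false;

        public void commit() { committed = true; }
        public void abort() { /* undo partial state changes */ }
        public void cleanup() { /* release resources in either case */ }

        @Override
        public void close() {
            if (!committed)
                abort();
            cleanup();
        }
    }

    static class Writer extends AbstractTransactional {
        public void prepareToCommit() { /* flush/sync pending state */ }
    }

    // Normal flow: prepareToCommit, then commit; on any failure the
    // try-with-resources close() performs abort() + cleanup().
    public static boolean runOnce(boolean fail) {
        try (Writer w = new Writer()) {
            w.prepareToCommit();
            if (fail)
                throw new RuntimeException("simulated failure");
            w.commit();
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }
}
```

The GC-enforced variant mentioned in CASSANDRA-8568 would add a finalization safety net for cases where a try block is impossible; that is omitted here.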
[jira] [Commented] (CASSANDRA-9037) Terminal UDFs evaluated at prepare time throw protocol version error
[ https://issues.apache.org/jira/browse/CASSANDRA-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381759#comment-14381759 ] Sam Tunnicliffe commented on CASSANDRA-9037: Thanks Tyler, I've pushed another commit to the branch with additional tests as requested. The changes to CqlRecordReader aren't unrelated, its inner WrappedRow class implements com.datastax.driver.core.Row which has been extended since version 2.1.2 of the driver (in [5c1e121f|https://github.com/datastax/java-driver/commit/5c1e121f0cc6e39e4d1349bb30f409ae486b3d97#diff-3b8ffce5c217f9096226305ecfd5a49a] [a0c42dd2|https://github.com/datastax/java-driver/commit/a0c42dd24f65d3b6e7c558dce68ae1b48c6da7f7]) Terminal UDFs evaluated at prepare time throw protocol version error Key: CASSANDRA-9037 URL: https://issues.apache.org/jira/browse/CASSANDRA-9037 Project: Cassandra Issue Type: Bug Reporter: Sam Tunnicliffe Assignee: Sam Tunnicliffe Fix For: 3.0 When a pure function with only terminal arguments (or with no arguments) is used in a where clause, it's executed at prepare time and {{Server.CURRENT_VERSION}} passed as the protocol version for serialization purposes. For native functions, this isn't a problem, but UDFs use classes in the bundled java-driver-core jar for (de)serialization of args and return values. 
When {{Server.CURRENT_VERSION}} is greater than the highest version supported by the bundled java driver the execution fails with the following exception: {noformat} ERROR [SharedPool-Worker-1] 2015-03-24 18:10:59,391 QueryMessage.java:132 - Unexpected error during query org.apache.cassandra.exceptions.FunctionExecutionException: execution of 'ks.overloaded[text]' failed: java.lang.IllegalArgumentException: No protocol version matching integer version 4 at org.apache.cassandra.exceptions.FunctionExecutionException.create(FunctionExecutionException.java:35) ~[main/:na] at org.apache.cassandra.cql3.udf.gen.Cksoverloaded_1.execute(Cksoverloaded_1.java) ~[na:na] at org.apache.cassandra.cql3.functions.FunctionCall.executeInternal(FunctionCall.java:78) ~[main/:na] at org.apache.cassandra.cql3.functions.FunctionCall.access$200(FunctionCall.java:34) ~[main/:na] at org.apache.cassandra.cql3.functions.FunctionCall$Raw.execute(FunctionCall.java:176) ~[main/:na] at org.apache.cassandra.cql3.functions.FunctionCall$Raw.prepare(FunctionCall.java:161) ~[main/:na] at org.apache.cassandra.cql3.SingleColumnRelation.toTerm(SingleColumnRelation.java:108) ~[main/:na] at org.apache.cassandra.cql3.SingleColumnRelation.newEQRestriction(SingleColumnRelation.java:143) ~[main/:na] at org.apache.cassandra.cql3.Relation.toRestriction(Relation.java:127) ~[main/:na] at org.apache.cassandra.cql3.restrictions.StatementRestrictions.init(StatementRestrictions.java:126) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepareRestrictions(SelectStatement.java:787) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:740) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:488) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:252) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:246) ~[main/:na] at 
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119) ~[main/:na] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:475) [main/:na] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:371) [main/:na] at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_71] at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [main/:na] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na] at
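One natural mitigation for the failure above is to clamp the protocol version handed to the driver classes to the highest version the bundled driver supports. This is an assumption about the fix direction, not the committed patch, and the constant names below are invented for illustration:

```java
// Hypothetical sketch: never pass the bundled java-driver a protocol
// version it cannot map. Constant values are illustrative (the ticket's
// failure is "No protocol version matching integer version 4").
public class ProtocolClamp {
    static final int SERVER_CURRENT_VERSION = 4; // what the server speaks
    static final int DRIVER_MAX_SUPPORTED = 3;   // what the bundled driver knows

    // Version to use when (de)serializing UDF args and return values.
    static int versionForUdfSerialization() {
        return Math.min(SERVER_CURRENT_VERSION, DRIVER_MAX_SUPPORTED);
    }
}
```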
[jira] [Commented] (CASSANDRA-8917) Upgrading from 2.0.9 to 2.1.3 with 3 nodes, CL = quorum causes exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381768#comment-14381768 ] Gary Ogden commented on CASSANDRA-8917: --- We haven't attempted the upgrade again since we ran into this issue. Upgrading from 2.0.9 to 2.1.3 with 3 nodes, CL = quorum causes exceptions - Key: CASSANDRA-8917 URL: https://issues.apache.org/jira/browse/CASSANDRA-8917 Project: Cassandra Issue Type: Bug Environment: C* 2.0.9, Centos 6.5, Java 1.7.0_72, spring data cassandra 1.1.1, cassandra java driver 2.0.9 Reporter: Gary Ogden Fix For: 2.1.4 Attachments: b_output.log, jersey_error.log, node1-cassandra.yaml, node1-system.log, node2-cassandra.yaml, node2-system.log, node3-cassandra.yaml, node3-system.log We have Java apps running on GlassFish that read/write to our 3-node cluster running on 2.0.9. We have the CL set to quorum for all reads and writes. When we started to upgrade the first node and ran the sstable upgrade on that node, we started getting this error on reads and writes: com.datastax.driver.core.exceptions.UnavailableException: Not enough replica available for query at consistency QUORUM (2 required but only 1 alive) How is that possible when we have 3 nodes total and 2 of them were up, yet it's saying we can't meet the required CL? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9023) 2.0.13 write timeouts on driver
[ https://issues.apache.org/jira/browse/CASSANDRA-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9023: --- Fix Version/s: 2.0.14 2.0.13 write timeouts on driver --- Key: CASSANDRA-9023 URL: https://issues.apache.org/jira/browse/CASSANDRA-9023 Project: Cassandra Issue Type: Bug Reporter: anishek Fix For: 2.0.14 Attachments: out_system.log Initially asked @ http://www.mail-archive.com/user@cassandra.apache.org/msg41621.html Was suggested to post here. If any more details are required please let me know -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-6477: Reviewer: Sam Tunnicliffe Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8150) Revaluate Default JVM tuning parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8150: -- Assignee: Ryan McGuire (was: Brandon Williams) Revaluate Default JVM tuning parameters --- Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Ryan McGuire Attachments: upload.png It's been found that the old Twitter recommendation of 100m per core up to 800m is harmful and should no longer be used. Instead, the formula should be 1/3 or 1/4 of max heap, with a cap of 2G. Whether 1/3 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess, 1/3 is probably better for releases greater than 2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
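Reading the ticket as young-generation sizing (the old 100m-per-core-up-to-800m rule was the classic HEAP_NEWSIZE guidance), the proposal reduces to a one-line calculation. The method name is invented; the divisor and 2G cap come straight from the ticket:

```java
// Sketch of the proposed rule: young gen = maxHeap / 3 (or / 4), capped
// at 2048 MB, replacing "100 MB per core up to 800 MB". Which divisor to
// use is still open in the ticket.
public class NewGenSizing {
    static long youngGenMB(long maxHeapMB, int divisor) {
        return Math.min(maxHeapMB / divisor, 2048);
    }
}
```

For example, a 4 GB heap with divisor 4 yields a 1 GB young generation, while an 8 GB heap hits the 2 GB cap.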
[jira] [Assigned] (CASSANDRA-8893) RandomAccessReader should share its FileChannel with all instances (via SegmentedFile)
[ https://issues.apache.org/jira/browse/CASSANDRA-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis reassigned CASSANDRA-8893: - Assignee: Stefania (was: Benedict) Stefania, can you take a stab at this? RandomAccessReader should share its FileChannel with all instances (via SegmentedFile) -- Key: CASSANDRA-8893 URL: https://issues.apache.org/jira/browse/CASSANDRA-8893 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Fix For: 3.0 There's no good reason to open a FileChannel for each (Compressed)?RandomAccessReader, and this would simplify RandomAccessReader to just a thin wrapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
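The sharing the ticket proposes can be sketched with plain NIO: one FileChannel owned by a SegmentedFile-like holder, each reader doing positional reads so no per-reader channel (or channel position) is needed. The class names are illustrative simplifications, not Cassandra's actual types:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: a single shared FileChannel; readers are thin wrappers.
public class SharedChannelSketch {
    static class SegmentedFile implements AutoCloseable {
        final FileChannel channel;
        SegmentedFile(Path path) throws IOException {
            channel = FileChannel.open(path, StandardOpenOption.READ);
        }
        Reader createReader() { return new Reader(channel); }
        public void close() throws IOException { channel.close(); }
    }

    // Positional read(dst, position) never touches the shared channel's
    // file position, so concurrent readers stay safe.
    static class Reader {
        private final FileChannel channel;
        Reader(FileChannel channel) { this.channel = channel; }
        int read(ByteBuffer dst, long position) throws IOException {
            return channel.read(dst, position);
        }
    }

    static String readAt(Path p, long pos, int len) throws IOException {
        try (SegmentedFile sf = new SegmentedFile(p)) {
            ByteBuffer buf = ByteBuffer.allocate(len);
            sf.createReader().read(buf, pos);
            buf.flip();
            return new String(buf.array(), 0, buf.remaining(), StandardCharsets.UTF_8);
        }
    }

    // Self-contained demo: write a temp file, read a slice positionally.
    public static String demo() {
        try {
            Path p = Files.createTempFile("segmented", ".db");
            Files.write(p, "hello world".getBytes(StandardCharsets.UTF_8));
            String s = readAt(p, 6, 5);
            Files.delete(p);
            return s;
        } catch (IOException e) {
            throw new AssertionError(e);
        }
    }
}
```

FileChannel's positional read is documented as usable concurrently, which is what makes sharing one channel per sstable viable.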
[jira] [Commented] (CASSANDRA-8150) Revaluate Default JVM tuning parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383232#comment-14383232 ] Hans van der Linde commented on CASSANDRA-8150: --- (out-of-office auto-reply; contact details and standard confidentiality notice omitted) Revaluate Default JVM tuning parameters --- Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Ryan McGuire Attachments: upload.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-6680) Clock skew detection via gossip
[ https://issues.apache.org/jira/browse/CASSANDRA-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6680: -- Assignee: Stefania (was: Brandon Williams) Clock skew detection via gossip --- Key: CASSANDRA-6680 URL: https://issues.apache.org/jira/browse/CASSANDRA-6680 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Brandon Williams Assignee: Stefania Priority: Minor Fix For: 3.0 Gossip's HeartbeatState keeps the generation (local timestamp the node was started) and version (monotonically increasing per gossip interval) which could be used to roughly calculate the node's current time, enabling detection of gossip messages too far in the future for the clocks to be synced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
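The calculation hinted at above can be sketched directly: generation is the node's start time (epoch seconds) and version ticks roughly once per gossip interval, so generation + version approximates the remote clock. The interval constant and tolerance parameter below are assumptions for illustration, not values from the ticket:

```java
// Sketch of clock-skew detection from gossip HeartbeatState, per the
// ticket's description. GOSSIP_INTERVAL_SECONDS and the tolerance are
// assumed values; the estimate is deliberately rough.
public class ClockSkewSketch {
    static final long GOSSIP_INTERVAL_SECONDS = 1;

    static long estimatedRemoteEpochSeconds(long generation, long version) {
        return generation + version * GOSSIP_INTERVAL_SECONDS;
    }

    // Flag gossip whose implied clock is too far from ours to be synced.
    static boolean skewSuspected(long generation, long version,
                                 long localEpochSeconds, long toleranceSeconds) {
        long remote = estimatedRemoteEpochSeconds(generation, version);
        return Math.abs(remote - localEpochSeconds) > toleranceSeconds;
    }
}
```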
[jira] [Updated] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub
[ https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-5969: -- Assignee: Stefania (was: Brandon Williams) Allow JVM_OPTS to be passed to sstablescrub --- Key: CASSANDRA-5969 URL: https://issues.apache.org/jira/browse/CASSANDRA-5969 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Adam Hattrell Assignee: Stefania Labels: lhf Can you add a feature request to pass JVM_OPTS to the sstablescrub script -- and other places where java is being called? (Among other things, this lets us run java stuff with -Djava.awt.headless=true on OS X so that Java processes don't pop up into the foreground -- i.e. we have a script that loops over all CFs and runs sstablescrub, and without that flag being passed in the OS X machine becomes pretty much unusable as it keeps switching focus to the java processes as they start.) --- a/resources/cassandra/bin/sstablescrub +++ b/resources/cassandra/bin/sstablescrub @@ -70,7 +70,7 @@ if [ x$MAX_HEAP_SIZE = x ]; then MAX_HEAP_SIZE=256M fi -$JAVA -ea -cp $CLASSPATH -Xmx$MAX_HEAP_SIZE \ +$JAVA $JVM_OPTS -ea -cp $CLASSPATH -Xmx$MAX_HEAP_SIZE \ -Dlog4j.configuration=log4j-tools.properties \ org.apache.cassandra.tools.StandaloneScrubber $@ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
Roman Tkachenko created CASSANDRA-9045: -- Summary: Deleted columns are resurrected after repair in wide rows Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Priority: Critical Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. 
Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related, I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB so we can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
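For context, the usual mechanism by which deletes resurrect is a tombstone being purged once it is older than gc_grace_seconds while some replica missed the delete; repair then streams the live column back. The reporter's gc_grace experiment argues against that being the cause here, but the timing rule is worth stating as a sketch (illustrative code, not Cassandra's compaction logic):

```java
// Sketch of the tombstone purge rule: a tombstone becomes purgeable during
// compaction once gc_grace_seconds have elapsed since the deletion. If a
// replica never saw the delete, repair after that point resurrects the data.
public class TombstonePurge {
    static boolean purgeable(long deletionEpochSeconds, long nowEpochSeconds,
                             long gcGraceSeconds) {
        return deletionEpochSeconds + gcGraceSeconds < nowEpochSeconds;
    }
}
```

This is why the standard guidance is to complete a full repair on every node within each gc_grace window.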
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382307#comment-14382307 ] Joshua McKenzie commented on CASSANDRA-8984: Tests on Windows are working after that last push. Given our other discussions about a new release cycle, I think the debate over whether we consider 2.1 a stable release will have a short shelf-life. I'm on the fence w/this change as it's largely a refactor of existing flow into codified objects, but we're also late in the 2.1 release cycle for changes that touch this much of the code-base in this fashion. Introduce Transactional API for behaviours that can corrupt system state Key: CASSANDRA-8984 URL: https://issues.apache.org/jira/browse/CASSANDRA-8984 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.1.4 Attachments: 8984_windows_timeout.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382304#comment-14382304 ] Philip Thompson commented on CASSANDRA-9045: I'm very interested in the cqlsh traces for the delete and select queries. It doesn't seem like a repair issue, so I'm unassigning Yuki Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Philip Thompson Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-9033. Resolution: Not a Problem Yes, it is always OK to change compaction strategy, if doing that corrupts your data, that is of course an actual issue (if you have logs or can reproduce it, please file a new ticket) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Environment: * Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux * EC2 m2-xlarge instances [4cpu, 16GB RAM, 1TB storage on 3 platters] * 12 nodes running a mix of 2.1.1 and 2.1.3 * 8GB stack size with offheap objects Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It has been running great though until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well. 
Our schema: {code} cqlsh> describe columnfamily data.stories CREATE TABLE data.stories ( id timeuuid PRIMARY KEY, action_data timeuuid, action_name text, app_id timeuuid, app_instance_id timeuuid, data map<text, text>, objects set<timeuuid>, time_stamp timestamp, user_id timeuuid ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; cqlsh> {code} There were no log entries that stood out. It pretty much consisted of "x is down" / "x is up" repeated ad infinitum. I have attached the zipped system.log that has the situation after the upgrade and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data. On another note, we see a lot of this during repair now (on all the nodes): {code} ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error java.io.IOException: Failed during snapshot creation. 
at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3] at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55] ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime] java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation. at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_55] at
[jira] [Updated] (CASSANDRA-9046) Allow Cassandra config to be updated to restart Daemon without unloading classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emmanuel Hugonnet updated CASSANDRA-9046: - Summary: Allow Cassandra config to be updated to restart Daemon without unloading classes (was: Allow Cassandra config to be updated to restart Deaemon without unloading classes) Allow Cassandra config to be updated to restart Daemon without unloading classes Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Fix For: 3.0 Attachments: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch Make applyConfig public in DatabaseDescriptor so that if we embed C* we can restart it after some configuration change without having to stop the whole application to unload the class which is configured once and for all in a static block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
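The pattern the attached patch enables can be illustrated generically: configuration applied once in a static initializer cannot be changed without unloading the class, whereas a public applyConfig(...) entry point can be re-invoked when an embedding application restarts the daemon. The classes below are illustrative stand-ins, not Cassandra's DatabaseDescriptor:

```java
// Sketch of the re-appliable-config pattern the patch exposes; the real
// change makes DatabaseDescriptor.applyConfig public, this is a generic
// illustration only.
public class ReapplyConfigSketch {
    static class Config {
        final int port;
        Config(int port) { this.port = port; }
    }

    static class Descriptor {
        private static Config current = new Config(0);
        // Public, re-invokable entry point instead of a one-shot static block.
        public static void applyConfig(Config c) { current = c; }
        public static int port() { return current.port; }
    }

    // Embedded restart: re-apply config without unloading Descriptor.
    public static int restartWith(int newPort) {
        Descriptor.applyConfig(new Config(newPort));
        return Descriptor.port();
    }
}
```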
[jira] [Updated] (CASSANDRA-8979) MerkleTree mismatch for deleted and non-existing rows
[ https://issues.apache.org/jira/browse/CASSANDRA-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-8979: -- Attachment: (was: cassandra-2.0-8979-validator_patch.txt) MerkleTree mismatch for deleted and non-existing rows - Key: CASSANDRA-8979 URL: https://issues.apache.org/jira/browse/CASSANDRA-8979 Project: Cassandra Issue Type: Bug Components: Core Reporter: Stefan Podkowinski Assignee: Yuki Morishita Attachments: cassandra-2.0-8979-lazyrow_patch.txt, cassandra-2.0-8979-validator_patch.txt, cassandra-2.0-8979-validatortest_patch.txt, cassandra-2.1-8979-lazyrow_patch.txt, cassandra-2.1-8979-validator_patch.txt Validation compaction will currently create different hashes for rows that have been deleted compared to nodes that have not seen the rows at all or have already compacted them away. In case this sounds familiar to you, see CASSANDRA-4905 which was supposed to prevent hashing of expired tombstones. This still seems to be in place, but does not address the issue completely. Or there was a change in 2.0 that rendered the patch ineffective. The problem is that rowHash() in the Validator will return a new hash in any case, whether the PrecompactedRow did actually update the digest or not. This will lead to the case that a purged, PrecompactedRow will not change the digest, but we end up with a different tree compared to not having rowHash called at all (such as in case the row already doesn't exist). As an implication, repair jobs will constantly detect mismatches between older sstables containing purgable rows and nodes that have already compacted these rows. After transfering the reported ranges, the newly created sstables will immediately get deleted again during the following compaction. This will happen for each repair run over again until the sstable with the purgable row finally gets compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
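The fix direction implied by the description is to fold a row's hash into the Merkle tree only when the row actually contributed bytes to the digest, so a fully purged row hashes the same as a row that never existed. The counting wrapper below is an illustrative sketch, not Cassandra's Validator:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch: track bytes fed to the digest; if a "row" turns out to be entirely
// purgable tombstones, report that nothing should be mixed into the tree.
public class RowHashSketch {
    static class CountingDigest {
        final MessageDigest digest;
        long bytesWritten = 0;
        CountingDigest() {
            try {
                digest = MessageDigest.getInstance("MD5");
            } catch (NoSuchAlgorithmException e) {
                throw new AssertionError(e); // MD5 is always present
            }
        }
        void update(byte[] b) { digest.update(b); bytesWritten += b.length; }
    }

    // Returns null when the row contributed nothing: the caller must then
    // leave the tree untouched rather than mixing in a spurious hash.
    static byte[] rowHash(CountingDigest d, byte[]... cells) {
        for (byte[] cell : cells)
            if (cell != null)           // null models a purged tombstone
                d.update(cell);
        return d.bytesWritten == 0 ? null : d.digest.digest();
    }
}
```

With the unconditional rowHash() the ticket describes, the purged-row case would still perturb the tree, producing the perpetual repair mismatches reported.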
[jira] [Updated] (CASSANDRA-8979) MerkleTree mismatch for deleted and non-existing rows
[ https://issues.apache.org/jira/browse/CASSANDRA-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-8979: -- Attachment: cassandra-2.0-8979-validatortest_patch.txt cassandra-2.0-8979-validator_patch.txt cassandra-2.0-8979-lazyrow_patch.txt cassandra-2.1-8979-validator_patch.txt cassandra-2.1-8979-lazyrow_patch.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-9033. Resolution: Duplicate Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Environment: * Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux * EC2 m2-xlarge instances [4cpu, 16GB RAM, 1TB storage on 3 platters] * 12 nodes running a mix of 2.1.1 and 2.1.3 * 8GB stack size with offheap objects Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It has been running great though until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well. 
Our schema: {code} cqlsh> describe columnfamily data.stories CREATE TABLE data.stories ( id timeuuid PRIMARY KEY, action_data timeuuid, action_name text, app_id timeuuid, app_instance_id timeuuid, data map<text, text>, objects set<timeuuid>, time_stamp timestamp, user_id timeuuid ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; cqlsh> {code} There were no log entries that stood out. It pretty much consisted of "x is down" / "x is up" repeated ad infinitum. I have attached the zipped system.log that has the situation after the upgrade and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data. On another note, we see a lot of this during repair now (on all the nodes): {code} ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error java.io.IOException: Failed during snapshot creation. 
at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3] at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55] ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime] java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation. at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_55] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_55] at
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9045: --- Assignee: Yuki Morishita Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Yuki Morishita Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. 
Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB so can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
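For context, the textbook resurrection mechanism that irregular repairs can trigger (which may or may not be what the reporter is hitting) can be sketched in a few lines. This is a toy model, not Cassandra code; replica, compaction, and repair semantics are drastically simplified: if a tombstone misses a replica and is purged on the others after gc_grace_seconds, a later repair streams the still-live column back to everyone.

```python
GC_GRACE_SECONDS = 864000  # 10 days, as in the bounces schema above

class Replica:
    def __init__(self):
        self.cells = {}        # key -> (value, write_ts)
        self.tombstones = {}   # key -> deletion_ts

    def delete(self, key, ts):
        self.cells.pop(key, None)
        self.tombstones[key] = ts

    def compact(self, now):
        # Tombstones older than gc_grace are purged for good.
        self.tombstones = {k: ts for k, ts in self.tombstones.items()
                           if now - ts < GC_GRACE_SECONDS}

def repair(replicas):
    # Grossly simplified: union of live cells, minus still-known tombstones.
    merged = {}
    for r in replicas:
        merged.update(r.cells)
    for r in replicas:
        for k, del_ts in r.tombstones.items():
            if k in merged and merged[k][1] <= del_ts:
                merged.pop(k)
    for r in replicas:
        r.cells = dict(merged)

r1, r2, r3 = Replica(), Replica(), Replica()
for r in (r1, r2, r3):
    r.cells["alice@example.com"] = ("bounce", 100)

now = 1000
r1.delete("alice@example.com", now)   # the delete reaches only 2 of 3 replicas
r2.delete("alice@example.com", now)

much_later = now + GC_GRACE_SECONDS + 1
for r in (r1, r2, r3):
    r.compact(much_later)             # tombstones purged on r1 and r2

repair((r1, r2, r3))
print("alice@example.com" in r1.cells)  # True: the deleted column is back
```

Note that the reporter's test deletes and repairs within gc_grace, so this simple model does not fully explain the behaviour; it only shows why repair cadence and gc_grace_seconds interact.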
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382278#comment-14382278 ] Philip Thompson commented on CASSANDRA-9045: What CL are you deleting at? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382278#comment-14382278 ] Philip Thompson edited comment on CASSANDRA-9045 at 3/26/15 5:31 PM: - To confirm, the delete is at LOCAL_QUORUM? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? Do you see the issue if you delete at ALL? was (Author: philipthompson): What CL are you deleting at? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382278#comment-14382278 ] Philip Thompson edited comment on CASSANDRA-9045 at 3/26/15 5:33 PM: - To confirm, the delete is at LOCAL_QUORUM? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? Do you see the issue if you delete at ALL? How long are the repairs taking? Is it within gc_grace? Can you attach traces of the delete query, and then the select query that returns the deleted entry? was (Author: philipthompson): To confirm, the delete is at LOCAL_QUORUM? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? Do you see the issue if you delete at ALL? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382298#comment-14382298 ] Roman Tkachenko commented on CASSANDRA-9045: Hi Philip - thanks for the quick response. Yes, normally the delete is LOCAL_QUORUM, but in my tests I was using ALL as well, with the same results. Let me see if I can enable DEBUG logging and run repair again. That's gonna be a lot of logs, I imagine... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8979) MerkleTree mismatch for deleted and non-existing rows
[ https://issues.apache.org/jira/browse/CASSANDRA-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382217#comment-14382217 ] Stefan Podkowinski commented on CASSANDRA-8979: --- I think the main problem here is that digest.update() is always called with the top-level row tombstone even if no tombstone exists. In this case the digest is updated with DeletionTime.LIVE, which doesn't seem to be correct. Patches have been updated, but I couldn't test 2.1 yet. I've also created a test cluster [bootstrap script|https://github.com/spodkowinski/phantom-cabinet-cases/tree/master/CASSANDRA-8979] to reproduce the problem for 2.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
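The digest.update()/DeletionTime.LIVE observation in the comment above can be paraphrased with a toy digest (plain Python; the field layout and function names are invented, not the actual Validator/AbstractCompactedRow code): mixing the LIVE sentinel into the digest produces a different row hash than omitting it, so the two code paths that compute a row's digest must agree on whether a live deletion time is included.

```python
import hashlib

# Hypothetical stand-in for DeletionTime.LIVE:
# (localDeletionTime = Integer.MAX_VALUE, markedForDeleteAt = Long.MIN_VALUE)
LIVE = (0x7FFFFFFF, -9223372036854775808)

def digest_row(cells, deletion_time, skip_live_deletion):
    """Digest a row's cells, optionally skipping a LIVE top-level deletion."""
    d = hashlib.sha256()
    for name, value in cells:
        d.update(name)
        d.update(value)
    # The suspected bug: the top-level deletion is mixed in even when LIVE.
    if not skip_live_deletion or deletion_time != LIVE:
        for field in deletion_time:
            d.update(str(field).encode())
    return d.hexdigest()

cells = [(b"address", b"alice@example.com")]
buggy = digest_row(cells, LIVE, skip_live_deletion=False)
fixed = digest_row(cells, LIVE, skip_live_deletion=True)

cells_only = hashlib.sha256()
for n, v in cells:
    cells_only.update(n)
    cells_only.update(v)
print(buggy == cells_only.hexdigest(), fixed == cells_only.hexdigest())  # False True
```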
[jira] [Updated] (CASSANDRA-8845) sorted CQLSSTableWriter accept unsorted clustering keys
[ https://issues.apache.org/jira/browse/CASSANDRA-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-8845: -- Attachment: 8845-2.1.txt There was a change in ArrayBackedSortedColumns which makes sure that the rows are properly sorted when cached. The partition keys still need to be in sorted order, so only the clustering columns can change. Attached is a patch which changes the javadoc to reflect this change. sorted CQLSSTableWriter accept unsorted clustering keys --- Key: CASSANDRA-8845 URL: https://issues.apache.org/jira/browse/CASSANDRA-8845 Project: Cassandra Issue Type: Bug Reporter: Pierre N. Assignee: Carl Yeksigian Fix For: 2.1.4 Attachments: 8845-2.1.txt, TestSorted.java The javadoc says: {quote} The SSTable sorted order means that rows are added such that their partition key respect the partitioner order and for a given partition, that *the rows respect the clustering columns order*. public Builder sorted() {quote} It throws an exception when partition keys are in incorrect order; however, it doesn't throw when rows are inserted with clustering keys in incorrect order. It buffers them and sorts them into the correct order. {code} writer.addRow(1, 3); writer.addRow(1, 1); writer.addRow(1, 2); {code} {code} $ sstable2json sorted/ks/t1/ks-t1-ka-1-Data.db [ {"key": "1", "cells": [["\u0000\u0000\u0000\u0001:","",1424524149557000], ["\u0000\u0000\u0000\u0002:","",1424524149557000], ["\u0000\u0000\u0000\u0003:","",142452414955]]} ] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
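The behaviour the reporter observed, and that the javadoc patch documents, can be modelled with a toy writer (a hypothetical sketch, not the real CQLSSTableWriter API): partition keys must arrive in partitioner order or an error is raised, while clustering keys within a partition may arrive unsorted and are buffered and sorted before the partition is written out.

```python
class SortedWriterSketch:
    """Toy model of the documented contract: partition keys must be added in
    order; clustering keys within a partition are sorted on flush."""

    def __init__(self):
        self.last_pk = None   # last partition key seen
        self.current = []     # buffered clustering keys for current partition
        self.flushed = []     # (partition_key, sorted clustering keys) pairs

    def add_row(self, pk, ck):
        if self.last_pk is not None and pk < self.last_pk:
            raise ValueError("partition keys must respect partitioner order")
        if pk != self.last_pk:
            self.flush()      # starting a new partition: flush the old one
            self.last_pk = pk
        self.current.append(ck)

    def flush(self):
        if self.current:
            self.flushed.append((self.last_pk, sorted(self.current)))
            self.current = []

w = SortedWriterSketch()
w.add_row(1, 3)
w.add_row(1, 1)
w.add_row(1, 2)
w.flush()
print(w.flushed)  # [(1, [1, 2, 3])] -- clustering keys sorted, no error
```

This mirrors the reporter's sstable2json output: the three rows added as 3, 1, 2 end up on disk in clustering order 1, 2, 3.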
[jira] [Reopened] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson reopened CASSANDRA-9033: reopening to close as duplicate Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Environment: * Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux * EC2 m2-xlarge instances [4cpu, 16GB RAM, 1TB storage on 3 platters] * 12 nodes running a mix of 2.1.1 and 2.1.3 * 8GB stack size with offheap objects Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It has been running great though until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well. 
Our schema:

{code}
cqlsh> describe columnfamily data.stories

CREATE TABLE data.stories (
    id timeuuid PRIMARY KEY,
    action_data timeuuid,
    action_name text,
    app_id timeuuid,
    app_instance_id timeuuid,
    data map<text, text>,
    objects set<timeuuid>,
    time_stamp timestamp,
    user_id timeuuid
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see'
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
{code}

There were no log entries that stood out. It pretty much consisted of x is down x is up repeated ad infinitum. I have attached the zipped system.log that has the situation after the upgrade and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data.

On another note, we see a lot of this during repair now (on all the nodes):

{code}
ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error
java.io.IOException: Failed during snapshot creation.
    at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation.
    at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_55]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_55]
    at
[jira] [Assigned] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson reassigned CASSANDRA-9045: -- Assignee: Philip Thompson (was: Yuki Morishita) Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Philip Thompson Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL.

h5. Problem

We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue:
* delete an entry
* verify it's not returned even with CL=ALL
* run repair on nodes that own this row's key
* the columns reappear and are returned even with CL=ALL

I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair.

h5. Other steps I've taken so far

Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test: updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related, I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks!
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
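For context on the failure mode described above: whether a replica may purge a tombstone is simple arithmetic on its deletion time and gc_grace_seconds, and resurrection happens when some replicas purge the tombstone while another replica never received the delete, so repair streams the "live" column back. A minimal sketch of that rule, with illustrative names rather than Cassandra's actual code:

```java
// Sketch of the tombstone-purge rule referenced above: a tombstone may be
// dropped by compaction once (localDeletionTime + gcGraceSeconds) has passed.
// If some replicas compact it away while another never saw the delete, repair
// streams the deleted column back. Names here are illustrative.
public class TombstonePurge {
    static boolean purgeable(long localDeletionTime, long gcGraceSeconds, long now) {
        return localDeletionTime + gcGraceSeconds < now;
    }

    public static void main(String[] args) {
        long gcGrace = 864000; // 10 days, the table's gc_grace_seconds
        long deletedAt = 1_000_000L;
        System.out.println(purgeable(deletedAt, gcGrace, deletedAt + 863_999)); // false
        System.out.println(purgeable(deletedAt, gcGrace, deletedAt + 864_001)); // true
    }
}
```

This is why repairs must complete within gc_grace on every node: once that window lapses on any replica, the tombstone can vanish there while the pre-delete data survives elsewhere.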
[jira] [Commented] (CASSANDRA-8993) EffectiveIndexInterval calculation is incorrect
[ https://issues.apache.org/jira/browse/CASSANDRA-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382302#comment-14382302 ] Tyler Hobbs commented on CASSANDRA-8993: I'll try to explain a bit about how downsampling works overall so that more people besides myself understand how it works :) I can put whatever info is useful into comments for posterity. bq. If I print out the original indices and effective intervals, it seems that at the first downsampling level (64) The sampling level after minimal downsampling is 127, not 64. The sampling level can be anywhere between 0 and BASE_SAMPLING_LEVEL. When a summary moves from sampling level 128 to level 127, it will drop one summary entry with an index between \[0, 127\], one entry between \[127, 255\], and so on for the rest of the summary. The index to drop is determined by {{Downsampling.getSamplingPattern()}}. The list of integers returned from {{Downsampling.getSamplingPattern(BASE_SAMPLING_LEVEL)}} are the indexes that we'll drop for each round of downsampling. As an example, suppose BASE_SAMPLING_LEVEL is 16 instead of 128. {{Downsampling.getSamplingPattern(16)}} returns the following pattern: {noformat} 15, 7, 11, 3, 13, 5, 9, 1, 14, 6, 10, 2, 12, 4, 8, 0 {noformat} So, when we move from sampling level 16 to 15, we'll drop the entry at index 15 (and repeat that for indexes 15 + (16 * 1), 15 + (16 * 2), 15 + (16 * 3), etc). When we move from sampling level 15 to 14, we'll drop the entry at index 7 (and repeat as before, but take into account the fact that we've already dropped the entry at index 15). This pattern of dropping minimizes the maximum distance between remaining summary entries. Now, in practice, we will never move from sampling level 128 directly to level 127 because of IndexSummaryManager's {{DOWNSAMPLE_THRESHOLD}}. However, an index summary could go through multiple rounds of down and upsampling and arrive at level 127, so we need to be able to handle that. 
bq. Further confusion to understanding Downsampling as a whole stems from the permission of a -1 index into getEffectiveIndexIntervalAfterIndex without explanation Hmm, yeah, looking at the code, I don't think we actually need to handle that. I believe it is leftover logic from earlier in the development of the code when downsampling would remove the 0th index in an earlier round. With the current code, the 0th index entry should always be present. I'll make some changes to remove that. bq. and the fact that every effective interval is the same despite there being multiple avenues for calculating it I'm not sure what you mean here. EffectiveIndexInterval calculation is incorrect --- Key: CASSANDRA-8993 URL: https://issues.apache.org/jira/browse/CASSANDRA-8993 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Blocker Fix For: 2.1.4 Attachments: 8993-2.1-v2.txt, 8993-2.1.txt, 8993.txt I'm not familiar enough with the calculation itself to understand why this is happening, but see discussion on CASSANDRA-8851 for the background. I've introduced a test case to look for this during downsampling, but it seems to pass just fine, so it may be an artefact of upgrading. The problem was, unfortunately, not manifesting directly because it would simply result in a failed lookup. This was only exposed when early opening used firstKeyBeyond, which does not use the effective interval, and provided the result to getPosition(). I propose a simple fix that ensures a bug here cannot break correctness. Perhaps [~thobbs] can follow up with an investigation as to how it actually went wrong? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
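The drop pattern Tyler quotes for BASE_SAMPLING_LEVEL = 16 can be reproduced with a short recursive sketch (an illustrative reimplementation, not the actual Downsampling class): odd indexes are dropped before even ones, and each half is ordered by recursing on samplingLevel / 2, which is what minimizes the maximum distance between the summary entries that remain.

```java
import java.util.*;

// Illustrative reimplementation of the sampling pattern described above
// (not the actual org.apache.cassandra.io.sstable.Downsampling source):
// the entry dropped in each downsampling round interleaves odds before evens,
// ordering each half by recursing on samplingLevel / 2.
public class SamplingPattern {
    static List<Integer> getSamplingPattern(int samplingLevel) {
        if (samplingLevel <= 1)
            return Collections.singletonList(0);
        List<Integer> half = getSamplingPattern(samplingLevel / 2);
        List<Integer> pattern = new ArrayList<>(samplingLevel);
        for (int i : half) pattern.add(2 * i + 1); // odd indexes first
        for (int i : half) pattern.add(2 * i);     // then the evens
        return pattern;
    }

    public static void main(String[] args) {
        // Matches the pattern quoted in the comment for BASE_SAMPLING_LEVEL = 16:
        System.out.println(getSamplingPattern(16));
        // [15, 7, 11, 3, 13, 5, 9, 1, 14, 6, 10, 2, 12, 4, 8, 0]
    }
}
```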
[jira] [Created] (CASSANDRA-9046) Allow Cassandra config to be updated to allow restarting without unloading classes
Emmanuel Hugonnet created CASSANDRA-9046: Summary: Allow Cassandra config to be updated to allow restarting without unloading classes Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Make applyConfig public in DatabaseDescriptor so that if we embed C* we can restart it after some configuration change without having to stop the whole application to unload the class which is configured once and for all in a static block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382301#comment-14382301 ] Roman Tkachenko commented on CASSANDRA-9045: Repairs are definitely within gc_grace, which is 10 days. A repair of a single node (nodetool repair blackbook bounce) takes about 1.5 hours. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Yuki Morishita Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382330#comment-14382330 ] Roman Tkachenko commented on CASSANDRA-9045: I'll run the test and try to get them to you. Not so sure about the logs though. I've enabled DEBUG and the node hasn't finished starting yet but has already produced ~1GB of logs. If you know how to enable debug mode just for the repair/compaction components, let me know. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Philip Thompson Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
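On enabling DEBUG only for the repair/compaction components: Cassandra 2.0 is configured through log4j, so per-package log levels can be set in conf/log4j-server.properties rather than raising the root logger. The package names below are the ones visible in the stack traces in this thread (org.apache.cassandra.repair, org.apache.cassandra.db.compaction); treat the fragment as a sketch:

```properties
# conf/log4j-server.properties (Cassandra 2.0.x) -- enable DEBUG only for the
# repair and compaction packages instead of turning it on via the root logger.
log4j.logger.org.apache.cassandra.repair=DEBUG
log4j.logger.org.apache.cassandra.db.compaction=DEBUG
```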
[jira] [Updated] (CASSANDRA-9046) Allow Cassandra config to be updated to restart Daemon without unloading classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emmanuel Hugonnet updated CASSANDRA-9046: - Summary: Allow Cassandra config to be updated to restart Daemon without unloading classes (was: Allow Cassandra config to be updated to allow restarting without unloading classes) Allow Cassandra config to be updated to restart Daemon without unloading classes - Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Fix For: 3.0 Attachments: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8979) MerkleTree mismatch for deleted and non-existing rows
[ https://issues.apache.org/jira/browse/CASSANDRA-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-8979: -- Attachment: (was: cassandra-2.0-8979-test.txt) MerkleTree mismatch for deleted and non-existing rows - Key: CASSANDRA-8979 URL: https://issues.apache.org/jira/browse/CASSANDRA-8979 Project: Cassandra Issue Type: Bug Components: Core Reporter: Stefan Podkowinski Assignee: Yuki Morishita Attachments: cassandra-2.0-8979-lazyrow_patch.txt, cassandra-2.0-8979-validator_patch.txt, cassandra-2.0-8979-validatortest_patch.txt, cassandra-2.1-8979-lazyrow_patch.txt, cassandra-2.1-8979-validator_patch.txt Validation compaction will currently create different hashes for rows that have been deleted compared to nodes that have not seen the rows at all or have already compacted them away. In case this sounds familiar to you, see CASSANDRA-4905, which was supposed to prevent hashing of expired tombstones. This still seems to be in place, but it does not address the issue completely; or there was a change in 2.0 that rendered the patch ineffective. The problem is that rowHash() in the Validator will return a new hash in any case, whether the PrecompactedRow did actually update the digest or not. This leads to the case where a purged PrecompactedRow will not change the digest, but we end up with a different tree compared to not having rowHash called at all (such as when the row already doesn't exist). As an implication, repair jobs will constantly detect mismatches between older sstables containing purgeable rows and nodes that have already compacted these rows. After transferring the reported ranges, the newly created sstables will immediately get deleted again during the following compaction. This will happen again on each repair run until the sstable with the purgeable row finally gets compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
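The mismatch described above can be modelled in a few lines: treat each range's Merkle tree hash as a digest over per-row contributions. If the validator mixes a row's token into the digest even when the row itself contributed no bytes, a node still holding a purgeable tombstone and a node that already compacted it produce different trees. A sketch with MD5 (illustrative, not the Validator's actual hashing):

```java
import java.security.MessageDigest;
import java.util.Arrays;

// Model of the Validator issue described above: mixing a purged row's token
// into the range digest -- even though the row contributes no data bytes --
// yields a different tree than not visiting the row at all. Illustrative only.
public class MerkleMismatch {
    static byte[] rangeDigest(boolean includePurgedRow) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update("token:1|row:alice".getBytes("UTF-8")); // a live row both nodes share
        if (includePurgedRow)
            md.update("token:2".getBytes("UTF-8")); // purged row: token mixed in, no row bytes
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] withTombstone = rangeDigest(true);   // node that still has the purgeable row
        byte[] compactedAway = rangeDigest(false);  // node that already compacted it away
        System.out.println(Arrays.equals(withTombstone, compactedAway)); // false -> repair mismatch
    }
}
```

The spurious mismatch then drives exactly the loop the report describes: the ranges are streamed, the streamed sstables are purged by the next compaction, and the next repair starts over.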
[jira] [Commented] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382231#comment-14382231 ] T Jake Luciani commented on CASSANDRA-8085: --- Technically it was [~slebresne]; mine were just bumps from releases. Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4 Attachments: 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
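To see why the round count dominates authentication latency: the work factor scales the per-attempt cost directly. bcrypt is not in the JDK, so as a stdlib-only illustration of the same cost knob this sketch times PBKDF2 via javax.crypto at two iteration counts (the password, salt, and counts are all illustrative, and PBKDF2 here stands in for bcrypt's rounds parameter):

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Illustration of the cost/latency trade-off discussed above, using PBKDF2
// from the JDK (bcrypt itself is not in the standard library). Raising the
// iteration count raises the time spent per authentication attempt, which is
// exactly what a configurable rounds setting trades off for short-lived
// connections.
public class HashingCost {
    static long timeHashNanos(int iterations) throws Exception {
        PBEKeySpec spec = new PBEKeySpec("password".toCharArray(),
                                         "0123456789abcdef".getBytes("UTF-8"),
                                         iterations, 256);
        SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
        long start = System.nanoTime();
        f.generateSecret(spec).getEncoded();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        System.out.printf("1024 iterations:  %d us%n", timeHashNanos(1 << 10) / 1000);
        System.out.printf("65536 iterations: %d us%n", timeHashNanos(1 << 16) / 1000);
    }
}
```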
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9045: --- Reproduced In: 2.0.13, 2.0.10 (was: 2.0.10, 2.0.13) Fix Version/s: 2.0.14 Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9046) Allow Cassandra config to be updated to allow restarting without unloading classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emmanuel Hugonnet updated CASSANDRA-9046: - Attachment: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch Allow Cassandra config to be updated to allow restarting without unloading classes -- Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Attachments: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382150#comment-14382150 ] Brent Haines commented on CASSANDRA-9033: - Holy shit. You're right. I apologize. Here is what happened: the stories table *was* LCS before we failed last week. When I built the schema for the replacement cluster, I stuck with STCS because the original failure made me nervous. It was stuck in my head that this was an LCS table, so I didn't actually review the results of the describe columnfamily. Or look for STCS-related bugs... I'm stupid, sorry. I will try the workaround, thanks for that. The initial failure was using LCS (I swear it), but the replacement cluster obviously failed for the reasons you gave. I'll set up the work-around you gave and wait for 2.1.4. One final question: is it generally ok to change compaction strategies on a large table? Before we restarted this, I tried to change from LCS to STCS and the key store was corrupted. Thanks for the help. Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip
[jira] [Commented] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382242#comment-14382242 ] Sylvain Lebresne commented on CASSANDRA-8085: - Almost surely due to a release version bump on my part too. This is why we should only set a single fix version before commit (the committer can update it to whatever version was actually committed to once the ticket is resolved); otherwise there is no simple way to bump versions, and that is what happened. TL;DR, the removal of 2.0 from the fix versions was an accident. Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4 Attachments: 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8989) Reading from table which contains collection type using token function and with CL ONE causes overwhelming writes to replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-8989: -- Attachment: 8989-2.0.txt Backported the patch from CASSANDRA-6863 for 2.0. I tested with a mixed cluster of 2.0.12 and patched nodes and there are no additional read repair requests; the 2.0.12 node continues to exhibit this behavior. Reading from table which contains collection type using token function and with CL ONE causes overwhelming writes to replicas --- Key: CASSANDRA-8989 URL: https://issues.apache.org/jira/browse/CASSANDRA-8989 Project: Cassandra Issue Type: Bug Components: Core Reporter: Miroslaw Partyka Assignee: Carl Yeksigian Priority: Critical Attachments: 8989-2.0.txt, trace.txt When reading from a table under the aforementioned conditions, each read from a replica also causes a write to the replica. Confirmed in versions 2.0.12 and 2.0.13; version 2.1.3 seems ok. To reproduce: {code}
CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2};
USE test;
CREATE TABLE bug(id int PRIMARY KEY, val map<int, int>);
INSERT INTO bug(id, val) VALUES (1, {2: 3});
CONSISTENCY LOCAL_QUORUM;
TRACING ON;
SELECT * FROM bug WHERE token(id) = 0;
{code} The trace contains "Appending to commitlog" and "Adding to bug memtable" twice each. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382531#comment-14382531 ] Roman Tkachenko commented on CASSANDRA-9045: I have attached an excerpt from cqlsh session showing select - delete - select - repair - select with tracing on. The very last select was issued after repair was done. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Attachments: cqlsh.txt Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue:
* delete an entry
* verify it's not returned even with CL=ALL
* run repair on nodes that own this row's key
* the columns reappear and are returned even with CL=ALL
I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB so we can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! 
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382524#comment-14382524 ] Ariel Weisberg edited comment on CASSANDRA-8670 at 3/26/15 7:52 PM: NIODataInputStream bq. readNext() should assert it is never shuffling more than 7 bytes; in fact ideally this would be done by readMinimum() to make it clearer By assert you mean an assert that compiles out or a precondition? bq. readNext() should IMO never shuffle unless it's at the end of its capacity; if it hasRemaining() and limit() != capacity() it should read on from its current limit (readMinimum can ensure there is room to fully meet its requirements) I guess I don't get when this optimization will help. I could see it hurting. You could stream through the buffer not returning to the beginning on a regular basis and end up issuing smaller than desired reads. Users of buffered input stream get this behavior and I didn't want to change it. DataInput and company pull bytes out one at a time even for multi-byte types. NIODataOutputStreamPlus bq. available() should return the bytes in the buffer at least I duplicated the JDK behavior for NIO. DataInputStream for a socket returns 0, for a file it returns the bytes remaining to read from the file. I think it makes sense for the API when you don't have a real answer. bq. why the use of hollowBuffer? For clarity in case of restoring the cursor position during exceptions? Would be helpful to clarify with a comment. It seems like perhaps this should only be used for the first branch, though, since the second should have no risk of throwing an exception, so we can safely restore the position. It seems like it might be best to make hollowBuffer default to null, and instantiate it only if it is larger than our buffer size, otherwise first flushing our internal buffer if we haven't got enough room. This way we should rarely need the hollowBuffer. 
The contract of the API requires that the incoming buffer not be modified. For thread safety reasons I don't modify the original buffer's position and then reset it in a finally block. I am not sure what you mean by hollow buffer larger than our buffer. It's hollow so it has no size. We also use it to copy things into our buffer while preserving the original position. The rest is reasonable. was (Author: aweisberg): NIODataInputStream bq. readNext() should assert it is never shuffling more than 7 bytes; in fact ideally this would be done by readMinimum() to make it clearer By assert you mean an assert that compiles out or a precondition? bq. readNext() should IMO never shuffle unless it's at the end of its capacity; if it hasRemaining() and limit() != capacity() it should read on from its current limit (readMinimum can ensure there is room to fully meet its requirements) I guess I don't get when this optimization will help. I could see it hurting. You could stream through the buffer not returning to the beginning on a regular basis and end up issuing smaller than desired reads. NIODataOutputStreamPlus bq. available() should return the bytes in the buffer at least I duplicated the JDK behavior for NIO. DataInputStream for a socket returns 0, for a file it returns the bytes remaining to read from the file. I think it makes sense for the API when you don't have a real answer. bq. why the use of hollowBuffer? For clarity in case of restoring the cursor position during exceptions? Would be helpful to clarify with a comment. It seems like perhaps this should only be used for the first branch, though, since the second should have no risk of throwing an exception, so we can safely restore the position. It seems like it might be best to make hollowBuffer default to null, and instantiate it only if it is larger than our buffer size, otherwise first flushing our internal buffer if we haven't got enough room. This way we should rarely need the hollowBuffer. 
The contract of the API requires that the incoming buffer not be modified. For thread safety reasons I don't modify the original buffer's position and then reset it in a finally block. I am not sure what you mean by hollow buffer larger than our buffer. It's hollow so it has no size. We also use it to copy things into our buffer while preserving the original position. The rest is reasonable. Large columns + NIO memory pooling causes excessive direct memory usage --- Key: CASSANDRA-8670 URL: https://issues.apache.org/jira/browse/CASSANDRA-8670 Project: Cassandra Issue Type: Bug Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: largecolumn_test.py If you provide a large byte array to NIO
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382579#comment-14382579 ] Carl Yeksigian commented on CASSANDRA-9048: --- While I think this is really useful, I don't see why this would live in-tree, especially given part of the worry here is that cqlsh only works with a single version of Cassandra at a time -- I would imagine this would live in much the same way. Since it doesn't utilize anything in-tree, it would make sense to keep this as a separate repository. Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files much closer matches the format of the data more often than the SSTable format itself, and a tool that loads from delimited files is very useful. 
In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum:
- support specifying the input file or reading the data from stdin (so other command-line programs can pipe into the loader)
- supply the CQL schema for the input data
- support all data types other than collections (collections are a stretch goal/need)
- an option to specify the delimiter
- an option to specify comma as the decimal delimiter (for international use cases)
- an option to specify how NULL values are represented in the file (e.g., the empty string or the string NULL)
- an option to specify how BOOLEAN values are represented in the file (e.g., TRUE/FALSE or 0/1)
- an option to specify the Date and Time format
- an option to skip some number of rows at the beginning of the file
- an option to only read in some number of rows from the file
- an option to indicate how many parse errors to tolerate
- an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors)
- an option to specify the CQL port to connect to (with 9042 as the default)
Additional options would be useful, but this set of options/features is a start. A word on COPY. COPY comes via CQLSH, which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
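A couple of the proposed options (the field delimiter and the NULL marker) can be illustrated with a minimal parsing sketch. This is hypothetical code, not taken from the attached patch; the class and method names are invented for illustration, and it deliberately ignores harder cases such as quoted fields and escape characters:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: how per-line parsing could honor a configurable
// delimiter and NULL marker. Names are illustrative, not from the patch.
public class DelimitedLineParser {
    private final char delimiter;
    private final String nullString;

    public DelimitedLineParser(char delimiter, String nullString) {
        this.delimiter = delimiter;
        this.nullString = nullString;
    }

    /** Splits one line on the delimiter; fields equal to the NULL marker become null. */
    public List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= line.length(); i++) {
            if (i == line.length() || line.charAt(i) == delimiter) {
                String field = line.substring(start, i);
                fields.add(field.equals(nullString) ? null : field);
                start = i + 1;
            }
        }
        return fields;
    }
}
```

A real loader would layer the remaining options (type conversion, error tolerance, bad-line file) on top of a split step like this before binding the fields to a prepared CQL statement.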
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382617#comment-14382617 ] Aleksey Yeschenko commented on CASSANDRA-9048: -- We already have plans for a Spark-based, multiple-format data import/export tool. CSV files will be the first supported format, with other Cassandra tables supported too (see CASSANDRA-8234). That tool, once done, will go in the tree, and supersede CQLSH's COPY, among other things. Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382507#comment-14382507 ] Philip Thompson commented on CASSANDRA-9048: The comment in StringParser needs to be fixed; it does not reflect what the method does. You also don't follow the code style everywhere [1]. [1] http://wiki.apache.org/cassandra/CodeStyle Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Tkachenko updated CASSANDRA-9045: --- Attachment: cqlsh.txt Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Attachments: cqlsh.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382524#comment-14382524 ] Ariel Weisberg commented on CASSANDRA-8670: --- NIODataInputStream bq. readNext() should assert it is never shuffling more than 7 bytes; in fact ideally this would be done by readMinimum() to make it clearer By assert you mean an assert that compiles out or a precondition? bq. readNext() should IMO never shuffle unless it's at the end of its capacity; if it hasRemaining() and limit() != capacity() it should read on from its current limit (readMinimum can ensure there is room to fully meet its requirements) I guess I don't get when this optimization will help. I could see it hurting. You could stream through the buffer not returning to the beginning on a regular basis and end up issuing smaller than desired reads. NIODataOutputStreamPlus bq. available() should return the bytes in the buffer at least I duplicated the JDK behavior for NIO. DataInputStream for a socket returns 0, for a file it returns the bytes remaining to read from the file. I think it makes sense for the API when you don't have a real answer. bq. why the use of hollowBuffer? For clarity in case of restoring the cursor position during exceptions? Would be helpful to clarify with a comment. It seems like perhaps this should only be used for the first branch, though, since the second should have no risk of throwing an exception, so we can safely restore the position. It seems like it might be best to make hollowBuffer default to null, and instantiate it only if it is larger than our buffer size, otherwise first flushing our internal buffer if we haven't got enough room. This way we should rarely need the hollowBuffer. The contract of the API requires that the incoming buffer not be modified. For thread safety reasons I don't modify the original buffer's position and then reset it in a finally block. 
I am not sure what you mean by hollow buffer larger than our buffer. It's hollow so it has no size. We also use it to copy things into our buffer while preserving the original position. The rest is reasonable. Large columns + NIO memory pooling causes excessive direct memory usage --- Key: CASSANDRA-8670 URL: https://issues.apache.org/jira/browse/CASSANDRA-8670 Project: Cassandra Issue Type: Bug Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: largecolumn_test.py If you provide a large byte array to NIO and ask it to populate the byte array from a socket it will allocate a thread local byte buffer that is the size of the requested read no matter how large it is. Old IO wraps new IO for sockets (but not files) so old IO is affected as well. Even if you are using Buffered{Input | Output}Stream you can end up passing a large byte array to NIO. The byte array read method will pass the array to NIO directly if it is larger than the internal buffer. Passing large cells between nodes as part of intra-cluster messaging can cause the NIO pooled buffers to quickly reach a high watermark and stay there. This ends up costing 2x the largest cell size because there is a buffer for input and output since they are different threads. This is further multiplied by the number of nodes in the cluster - 1 since each has a dedicated thread pair with separate thread locals. Anecdotally it appears that the cost is doubled beyond that although it isn't clear why. Possibly the control connections or possibly there is some way in which multiple Need a workload in CI that tests the advertised limits of cells on a cluster. It would be reasonable to ratchet down the max direct memory for the test to trigger failures if a memory pooling issue is introduced. I don't think we need to test concurrently pulling in a lot of them, but it should at least work serially. 
The obvious fix to address this issue would be to read in smaller chunks when dealing with large values. I think small should still be relatively large (4 megabytes) so that code that is reading from a disk can amortize the cost of a seek. It can be hard to tell what the underlying thing being read from is going to be in some of the contexts where we might choose to implement switching to reading chunks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
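The chunked-read fix described above can be sketched in isolation. The helper below is illustrative, not Cassandra's actual stream classes, and the chunk size is kept tiny purely for demonstration (the discussion above suggests ~4 MB in practice so disk reads still amortize seek cost):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedReader
{
    // Illustrative chunk size; kept tiny only for demonstration.
    public static final int CHUNK_SIZE = 4;

    // Fill dst completely by issuing reads of at most CHUNK_SIZE bytes, so the
    // underlying layer never sees one huge destination array (and so NIO's
    // per-thread temporary direct buffer stays bounded at CHUNK_SIZE).
    public static void readFully(InputStream in, byte[] dst) throws IOException
    {
        int offset = 0;
        while (offset < dst.length)
        {
            int n = in.read(dst, offset, Math.min(CHUNK_SIZE, dst.length - offset));
            if (n < 0)
                throw new IOException("unexpected EOF after " + offset + " bytes");
            offset += n;
        }
    }

    public static void main(String[] args) throws IOException
    {
        byte[] src = "a large value read in small chunks".getBytes(StandardCharsets.UTF_8);
        byte[] dst = new byte[src.length];
        readFully(new ByteArrayInputStream(src), dst);
        System.out.println(new String(dst, StandardCharsets.UTF_8));
    }
}
```

Because each read() call presents at most CHUNK_SIZE bytes of destination at once, the temporary buffer NIO allocates per thread stays bounded regardless of how large the target array is.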
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382534#comment-14382534 ] Roman Tkachenko commented on CASSANDRA-9045: Forgot to mention that before the test I restored the original in memory compaction limit to the default 64MB so the row does not fit into this limit. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Attachments: cqlsh.txt Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB so can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! 
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382532#comment-14382532 ] Philip Thompson commented on CASSANDRA-9045: [~thobbs], this will be most meaningful to you. The Digest Mismatch seems interesting to me, how could that happen at CL=ALL for all operations? Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Attachments: cqlsh.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9045: --- Reproduced In: 2.0.13, 2.0.10 (was: 2.0.10, 2.0.13) Tester: Philip Thompson Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] cassandra git commit: Backport CASSANDRA-8085 to cassandra-2.0
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 14327e4b9 - 93156d761 Backport CASSANDRA-8085 to cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1b1acae9 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1b1acae9 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1b1acae9 Branch: refs/heads/cassandra-2.1 Commit: 1b1acae9afbd6faf2f628d5ae7ba0763aaac1e86 Parents: 9625910 Author: Tyler Hobbs ty...@datastax.com Authored: Thu Mar 26 13:22:09 2015 -0500 Committer: Tyler Hobbs ty...@datastax.com Committed: Thu Mar 26 13:22:09 2015 -0500 -- CHANGES.txt | 1 + .../apache/cassandra/auth/PasswordAuthenticator.java | 15 +-- 2 files changed, 14 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1acae9/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 293dc55..adc0d59 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.14: + * Make PasswordAuthenticator number of hashing rounds configurable (CASSANDRA-8085) * Lower logging level from ERROR to DEBUG when a scheduled schema pull cannot be completed due to a node being down (CASSANDRA-9032) * Fix MOVED_NODE client event (CASSANDRA-8516) http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1acae9/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java -- diff --git a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java index e4c00b7..3c6d1af 100644 --- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java +++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java @@ -53,8 +53,19 @@ public class PasswordAuthenticator implements ISaslAwareAuthenticator { private static final Logger logger = LoggerFactory.getLogger(PasswordAuthenticator.class); -// 2 ** GENSALT_LOG2_ROUNS rounds of hashing will be performed. 
-private static final int GENSALT_LOG2_ROUNDS = 10;
+// 2 ** GENSALT_LOG2_ROUNDS rounds of hashing will be performed.
+private static final String GENSALT_LOG2_ROUNDS_PROPERTY = "cassandra.auth_bcrypt_gensalt_log2_rounds";
+private static final int GENSALT_LOG2_ROUNDS = getGensaltLogRounds();
+
+static int getGensaltLogRounds()
+{
+    int rounds = Integer.getInteger(GENSALT_LOG2_ROUNDS_PROPERTY, 10);
+    if (rounds < 4 || rounds > 31)
+        throw new RuntimeException(new ConfigurationException(String.format("Bad value for system property -D%s."
+                                                                            + " Please use a value between 4 and 31",
+                                                                            GENSALT_LOG2_ROUNDS_PROPERTY)));
+    return rounds;
+}

 // name of the hash column.
 private static final String SALTED_HASH = "salted_hash";
cassandra git commit: Backport CASSANDRA-8085 to cassandra-2.0
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 9625910a5 - 1b1acae9a Backport CASSANDRA-8085 to cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1b1acae9 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1b1acae9 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1b1acae9 Branch: refs/heads/cassandra-2.0 Commit: 1b1acae9afbd6faf2f628d5ae7ba0763aaac1e86 Parents: 9625910 Author: Tyler Hobbs ty...@datastax.com Authored: Thu Mar 26 13:22:09 2015 -0500 Committer: Tyler Hobbs ty...@datastax.com Committed: Thu Mar 26 13:22:09 2015 -0500 -- CHANGES.txt | 1 + .../apache/cassandra/auth/PasswordAuthenticator.java | 15 +-- 2 files changed, 14 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1acae9/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 293dc55..adc0d59 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.14: + * Make PasswordAuthenticator number of hashing rounds configurable (CASSANDRA-8085) * Lower logging level from ERROR to DEBUG when a scheduled schema pull cannot be completed due to a node being down (CASSANDRA-9032) * Fix MOVED_NODE client event (CASSANDRA-8516) http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1acae9/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java -- diff --git a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java index e4c00b7..3c6d1af 100644 --- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java +++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java @@ -53,8 +53,19 @@ public class PasswordAuthenticator implements ISaslAwareAuthenticator { private static final Logger logger = LoggerFactory.getLogger(PasswordAuthenticator.class); -// 2 ** GENSALT_LOG2_ROUNS rounds of hashing will be performed. 
-private static final int GENSALT_LOG2_ROUNDS = 10;
+// 2 ** GENSALT_LOG2_ROUNDS rounds of hashing will be performed.
+private static final String GENSALT_LOG2_ROUNDS_PROPERTY = "cassandra.auth_bcrypt_gensalt_log2_rounds";
+private static final int GENSALT_LOG2_ROUNDS = getGensaltLogRounds();
+
+static int getGensaltLogRounds()
+{
+    int rounds = Integer.getInteger(GENSALT_LOG2_ROUNDS_PROPERTY, 10);
+    if (rounds < 4 || rounds > 31)
+        throw new RuntimeException(new ConfigurationException(String.format("Bad value for system property -D%s."
+                                                                            + " Please use a value between 4 and 31",
+                                                                            GENSALT_LOG2_ROUNDS_PROPERTY)));
+    return rounds;
+}

 // name of the hash column.
 private static final String SALTED_HASH = "salted_hash";
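The validation the patch adds can be exercised standalone. The sketch below mirrors getGensaltLogRounds() but substitutes IllegalArgumentException for Cassandra's ConfigurationException so it is self-contained:

```java
public class GensaltRounds
{
    public static final String PROPERTY = "cassandra.auth_bcrypt_gensalt_log2_rounds";

    // Mirrors the patched method: read the system property, default to 10,
    // and reject values outside bcrypt's supported log2-rounds range of 4..31.
    public static int getGensaltLogRounds()
    {
        int rounds = Integer.getInteger(PROPERTY, 10);
        if (rounds < 4 || rounds > 31)
            throw new IllegalArgumentException(
                String.format("Bad value for system property -D%s. Please use a value between 4 and 31", PROPERTY));
        return rounds;
    }

    public static void main(String[] args)
    {
        System.out.println(getGensaltLogRounds()); // default: 10
        System.setProperty(PROPERTY, "12");
        System.out.println(getGensaltLogRounds()); // now 12
    }
}
```

Operators would set the property on the server command line, e.g. `-Dcassandra.auth_bcrypt_gensalt_log2_rounds=12`; each increment doubles the bcrypt work factor.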
[jira] [Commented] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382416#comment-14382416 ] Brent Haines commented on CASSANDRA-9033: - I did not maintain logs for the corruption while changing compaction strategies. I imagine it might have been caused by the many millions of sstable files. I am relieved to see them all shrink to just a handful within 30 minutes of running with the settings you prescribed. Thank you. Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Environment: * Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux * EC2 m2-xlarge instances [4cpu, 16GB RAM, 1TB storage on 3 platters] * 12 nodes running a mix of 2.1.1 and 2.1.3 * 8GB stack size with offheap objects Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It has been running great though until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well. 
Our schema: {code} cqlsh describe columnfamily data.stories CREATE TABLE data.stories ( id timeuuid PRIMARY KEY, action_data timeuuid, action_name text, app_id timeuuid, app_instance_id timeuuid, data map<text, text>, objects set<timeuuid>, time_stamp timestamp, user_id timeuuid ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; cqlsh {code} There were no log entries that stood out. It pretty much consisted of "x is down" and "x is up" repeated ad infinitum. I have attached the zipped system.log that has the situation after the upgrade and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data. On another note, we see a lot of this during repair now (on all the nodes): {code} ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error java.io.IOException: Failed during snapshot creation. 
at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3] at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55] ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime] java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation. at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.3.jar:2.1.3] at
[jira] [Updated] (CASSANDRA-9034) AssertionError in SizeEstimatesRecorder
[ https://issues.apache.org/jira/browse/CASSANDRA-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-9034: -- Attachment: 9034-trunk.txt Was able to replicate by starting with {{-Dcassandra.join_ring=false}}. Added a check to SizeEstimatesRecorder to make sure StorageService has started. AssertionError in SizeEstimatesRecorder --- Key: CASSANDRA-9034 URL: https://issues.apache.org/jira/browse/CASSANDRA-9034 Project: Cassandra Issue Type: Bug Environment: Trunk (52ddfe412a) Reporter: Stefania Priority: Minor Fix For: 3.0 Attachments: 9034-trunk.txt One of the dtests of CASSANDRA-8236 (https://github.com/stef1927/cassandra-dtest/tree/8236) raises the following exception unless I set {{-Dcassandra.size_recorder_interval=0}}: {code} ERROR [OptionalTasks:1] 2015-03-25 12:58:47,015 CassandraDaemon.java:179 - Exception in thread Thread[OptionalTasks:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.service.StorageService.getLocalTokens(StorageService.java:2235) ~[main/:na] at org.apache.cassandra.db.SizeEstimatesRecorder.run(SizeEstimatesRecorder.java:61) ~[main/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:82) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_76] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_76] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76] INFO [RMI 
TCP Connection(2)-127.0.0.1] 2015-03-25 12:59:23,189 StorageService.java:863 - Joining ring by operator request {code} The test is {{start_node_without_join_test}} in _pushed_notifications_test.py_ but starting a node that won't join the ring might be sufficient to reproduce the exception (I haven't tried though). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
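The fix described (skipping the recorder until StorageService has started) amounts to a startup guard on a scheduled task. A generic sketch of that pattern, with illustrative names rather than Cassandra's actual fields:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class RecorderGuard
{
    // Illustrative stand-in for "StorageService has joined the ring".
    static final AtomicBoolean started = new AtomicBoolean(false);
    static final AtomicInteger recordings = new AtomicInteger();

    // The scheduled task: bail out quietly (instead of tripping an assertion
    // on missing local tokens) until the service reports it has started.
    static void run()
    {
        if (!started.get())
            return;
        recordings.incrementAndGet(); // stand-in for the real size-estimate work
    }

    public static void main(String[] args)
    {
        run();                 // fires before startup: silently skipped
        started.set(true);
        run();                 // fires after startup: does the work
        System.out.println(recordings.get()); // prints 1
    }
}
```

With `-Dcassandra.join_ring=false` the guard keeps the periodic task a no-op until the operator requests the join, at which point it resumes normally.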
[jira] [Commented] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382165#comment-14382165 ] Tyler Hobbs commented on CASSANDRA-8085: Is there a reason we shouldn't backport this to 2.0? It looks like [~tjake] set the fixver to 2.1 -- any particular reason for doing that? Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4 Attachments: 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382439#comment-14382439 ] Jonathan Ellis commented on CASSANDRA-8984: --- bq. our release page doesn't quite agree with this implicit assertion (that 2.1 is stable) It aspires to be stable. :) Let's keep the big changes to 3.x now. Introduce Transactional API for behaviours that can corrupt system state Key: CASSANDRA-8984 URL: https://issues.apache.org/jira/browse/CASSANDRA-8984 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.1.4 Attachments: 8984_windows_timeout.txt As a penultimate (and probably final for 2.1, if we agree to introduce it there) round of changes to the internals managing sstable writing, I've introduced a new API called Transactional that I hope will make it much easier to write correct behaviour. As things stand we conflate a lot of behaviours into methods like close - the recent changes unpicked some of these, but didn't go far enough. My proposal here introduces an interface designed to support four actions (on top of their normal function): * prepareToCommit * commit * abort * cleanup In normal operation, once we have finished constructing a state change we call prepareToCommit; once all such state changes are prepared, we call commit. If at any point everything fails, abort is called. In _either_ case, cleanup is called at the very last. These transactional objects are all AutoCloseable, with the behaviour being to rollback any changes unless commit has completed successfully. The changes are actually less invasive than it might sound, since we did recently introduce abort in some places, as well as have commit like methods. This simply formalises the behaviour, and makes it consistent between all objects that interact in this way. 
Much of the code change is boilerplate, such as moving an object into a try-declaration, although the change is still non-trivial. What it _does_ do is eliminate a _lot_ of special casing that we have had since 2.1 was released. The data tracker API changes and compaction leftover cleanups should finish the job with making this much easier to reason about, but this change I think is worthwhile considering for 2.1, since we've just overhauled this entire area (and not released these changes), and this change is essentially just the finishing touches, so the risk is minimal and the potential gains reasonably significant. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
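The four-action lifecycle described above, plus the AutoCloseable rollback-unless-committed behaviour, can be reduced to a small sketch. This is an illustration of the proposed contract, not the patch's actual interface:

```java
public abstract class Transactional implements AutoCloseable
{
    private enum State { IN_PROGRESS, PREPARED, COMMITTED, ABORTED }
    private State state = State.IN_PROGRESS;

    protected abstract void doPrepare();
    protected abstract void doCommit();
    protected abstract void doAbort();
    protected void doCleanup() {}

    public final void prepareToCommit()
    {
        if (state != State.IN_PROGRESS)
            throw new IllegalStateException("cannot prepare from " + state);
        doPrepare();
        state = State.PREPARED;
    }

    public final void commit()
    {
        if (state != State.PREPARED)
            throw new IllegalStateException("commit without prepare");
        doCommit();
        state = State.COMMITTED;
    }

    public final void abort()
    {
        if (state == State.COMMITTED || state == State.ABORTED)
            return;
        doAbort();
        state = State.ABORTED;
    }

    // AutoCloseable: roll back unless commit completed successfully,
    // and always run cleanup at the very last.
    @Override
    public final void close()
    {
        try
        {
            if (state != State.COMMITTED)
                abort();
        }
        finally
        {
            doCleanup();
        }
    }
}
```

Used in a try-with-resources block, any state change that throws before commit() completes is aborted and cleaned up automatically, which is the safety property the ticket is after.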
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382474#comment-14382474 ] Brian Hess commented on CASSANDRA-9048: I have created a version of this as a Java program via executeAsync(). Some testing has shown that for bulk writing to Cassandra, if you are starting with delimited files (not SSTables), Java's executeAsync() is more efficient/performant than creating SSTables and then calling sstableloader. This implementation provides for the options above, as well as a way to specify the parallelism of the asynchronous writing (the number of futures in flight). In addition to the Java implementation, I created a command-line utility a la cassandra-stress called cassandra-loader to invoke the Java classes with the appropriate CLASSPATH. As such, I also modified build.xml and tools/bin/cassandra.in.sh as appropriate. The patch is attached for review. The command-line usage statement is:
{code}
Usage: -f filename -host ipaddress -schema schema [OPTIONS]
OPTIONS:
  -delim delimiter              Delimiter to use [,]
  -delmInQuotes true            Set to 'true' if delimiter can be inside quoted fields [false]
  -dateFormat dateFormatString  Date format [default for Locale.ENGLISH]
  -nullString nullString        String that signifies NULL [none]
  -skipRows skipRows            Number of rows to skip [0]
  -maxRows maxRows              Maximum number of rows to read (-1 means all) [-1]
  -maxErrors maxErrors          Maximum errors to endure [10]
  -badFile badFilename          Filename for where to place badly parsed rows [none]
  -port portNumber              CQL Port Number [9042]
  -numFutures numFutures        Number of CQL futures to keep in flight [1000]
  -decimalDelim decimalDelim    Decimal delimiter [.] Other option is ','
  -boolStyle boolStyleString    Style for booleans [TRUE_FALSE]
{code}
Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Attachments: CASSANDRA-9048.patch There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files much more closely matches the format of the data in most cases than the SSTable format itself, and a tool that loads from delimited files is very useful. In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum: - support specifying the input file or to read the data from stdin (so other command-line programs can pipe into the loader) - supply the CQL schema for the input data - support all data types other than collections (collections is a stretch goal/need) - an option to specify the delimiter - an option to specify comma as the decimal delimiter (for international use cases) - an option to specify how NULL values are specified in the file (e.g., the empty string or the string NULL) - an option to specify how BOOLEAN values are specified in the file (e.g., TRUE/FALSE or 0/1) - an option to specify the Date and Time format - an option to skip some number of rows at the beginning of the file - an option to only read in some number of rows from the file - an option to indicate how many parse errors to tolerate - an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors) - an option to specify the CQL port to connect to (with 9042 as the default). Additional options would be useful, but this set of options/features is a start. 
A word on COPY. COPY comes via CQLSH which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
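The {{-numFutures}} throttle described in the comment above (bounding the number of asynchronous writes in flight) is commonly implemented with a semaphore around each submitted write. The sketch below illustrates that pattern only; it uses a plain executor as a stand-in for the driver's {{session.executeAsync()}}, and all names here are illustrative, not classes from the attached patch:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Throttled async loading: at most maxInFlight "writes" are pending at once.
public class ThrottledLoader {
    public static int load(int rows, int maxInFlight) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger written = new AtomicInteger();
        for (int i = 0; i < rows; i++) {
            permits.acquire();                 // blocks once maxInFlight writes are pending
            pool.execute(() -> {
                try {
                    written.incrementAndGet(); // stand-in for one async INSERT completing
                } finally {
                    permits.release();         // frees a slot for the next row
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return written.get();
    }
}
```

With the real driver, the `release()` would go in the future's completion callback rather than a finally block, so the permit is held for the full lifetime of the in-flight write.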
[jira] [Updated] (CASSANDRA-9046) Allow Cassandra config to be updated to restart Daemon without unloading classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9046: --- Reviewer: Ariel Weisberg Allow Cassandra config to be updated to restart Daemon without unloading classes Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Fix For: 3.0 Attachments: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch Make applyConfig public in DatabaseDescriptor so that if we embed C* we can restart it after some configuration change without having to stop the whole application to unload the class which is configured once and for all in a static block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
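The problem described in CASSANDRA-9046 is the static-initializer pattern: configuration applied in a static block runs once per classloader, so an embedding application cannot re-apply changed settings without unloading the class. A toy illustration of the shape of the fix (these are not Cassandra's real classes, just a sketch of making the apply step a public method):

```java
// Illustrative only: config captured in a static block vs. a re-applicable hook.
public class ConfigSketch {
    private static String setting;

    static { setting = "initial"; }   // runs exactly once per classloader

    // The shape of the requested change: a public method that can re-apply
    // configuration at any time, without reloading the class.
    public static void applyConfig(String newValue) { setting = newValue; }

    public static String current() { return setting; }
}
```

An embedded server could then call `applyConfig(...)` after editing its configuration and restart the daemon in-process.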
[jira] [Updated] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8085: --- Attachment: 8085-2.0.txt Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4, 2.0.14 Attachments: 8085-2.0.txt, 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
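The "rounds" knob here is bcrypt's log-rounds parameter: the work factor is 2^logRounds key-expansion iterations, so each increment roughly doubles hashing time. That is why the default of 10 (2^10 = 1024 rounds) hurts short-lived connections, and why lowering it is a meaningful trade-off. A small sketch of the arithmetic (jBCrypt, which Cassandra's PasswordAuthenticator uses, exposes this exponent via {{BCrypt.gensalt(logRounds)}}):

```java
// The bcrypt cost parameter is an exponent: iterations = 2^logRounds.
public class BcryptCost {
    static long iterations(int logRounds) { return 1L << logRounds; }

    public static void main(String[] args) {
        System.out.println(iterations(10)); // 1024 iterations: the current default
        System.out.println(iterations(4));  // 16 iterations: far cheaper per auth
    }
}
```

So dropping from 10 to 4 log-rounds cuts per-authentication work by a factor of 64, at the cost of weaker resistance to offline brute force of stolen hashes.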
[jira] [Assigned] (CASSANDRA-9034) AssertionError in SizeEstimatesRecorder
[ https://issues.apache.org/jira/browse/CASSANDRA-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian reassigned CASSANDRA-9034: - Assignee: Carl Yeksigian AssertionError in SizeEstimatesRecorder --- Key: CASSANDRA-9034 URL: https://issues.apache.org/jira/browse/CASSANDRA-9034 Project: Cassandra Issue Type: Bug Environment: Trunk (52ddfe412a) Reporter: Stefania Assignee: Carl Yeksigian Priority: Minor Fix For: 3.0 Attachments: 9034-trunk.txt One of the dtests of CASSANDRA-8236 (https://github.com/stef1927/cassandra-dtest/tree/8236) raises the following exception unless I set {{-Dcassandra.size_recorder_interval=0}}: {code} ERROR [OptionalTasks:1] 2015-03-25 12:58:47,015 CassandraDaemon.java:179 - Exception in thread Thread[OptionalTasks:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.service.StorageService.getLocalTokens(StorageService.java:2235) ~[main/:na] at org.apache.cassandra.db.SizeEstimatesRecorder.run(SizeEstimatesRecorder.java:61) ~[main/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:82) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_76] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_76] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76] INFO [RMI TCP Connection(2)-127.0.0.1] 2015-03-25 12:59:23,189 StorageService.java:863 - Joining ring by operator request {code} The test 
is {{start_node_without_join_test}} in _pushed_notifications_test.py_ but starting a node that won't join the ring might be sufficient to reproduce the exception (I haven't tried though). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9048) Delimited File Bulk Loader
Brian Hess created CASSANDRA-9048: -- Summary: Delimited File Bulk Loader Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files matches the format of the data far more often than the SSTable format does, and a tool that loads from delimited files is very useful. In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum: - support specifying the input file or reading the data from stdin (so other command-line programs can pipe into the loader) - supply the CQL schema for the input data - support all data types other than collections (collections is a stretch goal/need) - an option to specify the delimiter - an option to specify comma as the decimal delimiter (for international use cases) - an option to specify how NULL values are specified in the file (e.g., the empty string or the string NULL) - an option to specify how BOOLEAN values are specified in the file (e.g., TRUE/FALSE or 0/1) - an option to specify the Date and Time format - an option to skip some number of rows at the beginning of the file - an option to only read in some number of rows from the file - an option to indicate how many parse errors to tolerate - an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors) - an option to specify the CQL port to connect to (with 9042 as the default). Additional options would be useful, but this set of options/features is a start. A word on COPY. 
COPY comes via CQLSH which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382365#comment-14382365 ] Tyler Hobbs commented on CASSANDRA-9045: It sounds to me like the incremental compaction is not processing range tombstones correctly, and it's purging the tombstone without purging the shadowed data. It also sounds like the range tombstone is being dropped before gc_grace has passed, so something is going pretty wrong. It seems like we should be able to reproduce this with a similar schema and similar deletes on a row that's above the in-memory compaction threshold. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. 
Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again.
The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8085: --- Fix Version/s: 2.0.14 Okay, I've backported the patch to 2.0 and committed it as {{1b1acae}}. Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4, 2.0.14 Attachments: 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hess updated CASSANDRA-9048: -- Attachment: CASSANDRA-9048.patch Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Attachments: CASSANDRA-9048.patch There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files matches the format of the data far more often than the SSTable format does, and a tool that loads from delimited files is very useful. In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum: - support specifying the input file or reading the data from stdin (so other command-line programs can pipe into the loader) - supply the CQL schema for the input data - support all data types other than collections (collections is a stretch goal/need) - an option to specify the delimiter - an option to specify comma as the decimal delimiter (for international use cases) - an option to specify how NULL values are specified in the file (e.g., the empty string or the string NULL) - an option to specify how BOOLEAN values are specified in the file (e.g., TRUE/FALSE or 0/1) - an option to specify the Date and Time format - an option to skip some number of rows at the beginning of the file - an option to only read in some number of rows from the file - an option to indicate how many parse errors to tolerate - an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors) - an option to specify the CQL port to connect to (with 9042 
as the default). Additional options would be useful, but this set of options/features is a start. A word on COPY. COPY comes via CQLSH which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9045: --- Assignee: Marcus Eriksson (was: Philip Thompson) Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! 
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382359#comment-14382359 ] Philip Thompson commented on CASSANDRA-9045: After discussion with [~thobbs], seems like a problem with incremental compaction. Assigning to [~krummas] Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Philip Thompson Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! 
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8669) simple_repair test failing on 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-8669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382379#comment-14382379 ] Yuki Morishita commented on CASSANDRA-8669: --- Bisected down to this commit: [871f0039c5bf89be343039478c64ce835b04b5cf|https://github.com/apache/cassandra/commit/871f0039c5bf89be343039478c64ce835b04b5cf] (CASSANDRA-8429) With the one commit before (bedd97f7abea417c0165721888458e62392875e9), I can run {{repair_test.py}} continuously without failure. As Philip commented before, somehow extra ranges are being repaired when test fails. I don't know if this relates to early open compaction since I modified {{repair_test.py}} and set {{sstable_preemptive_open_interval_in_mb: -1}} but test still fails with the same error. Will dig some more. simple_repair test failing on 2.1 - Key: CASSANDRA-8669 URL: https://issues.apache.org/jira/browse/CASSANDRA-8669 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Yuki Morishita Fix For: 2.1.4 The dtest simple_repair_test began failing on 12/22 on 2.1 and trunk. The test fails intermittently both locally and on cassci. The test is here: https://github.com/riptano/cassandra-dtest/blob/master/repair_test.py#L32 The output is here: http://cassci.datastax.com/job/cassandra-2.1_dtest/661/testReport/repair_test/TestRepair/simple_repair_test/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9047) The FROZEN and TUPLE keywords should not be reserved in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9047: --- Reviewer: Benjamin Lerer The FROZEN and TUPLE keywords should not be reserved in CQL --- Key: CASSANDRA-9047 URL: https://issues.apache.org/jira/browse/CASSANDRA-9047 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Tyler Hobbs Priority: Trivial Fix For: 2.1.4 Attachments: 9047-2.1.txt It looks like we accidentally forgot to add the FROZEN and TUPLE keywords to the list of unreserved keywords in Cql.g. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9047) The FROZEN and TUPLE keywords should not be reserved in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9047: --- Attachment: 9047-2.1.txt The FROZEN and TUPLE keywords should not be reserved in CQL --- Key: CASSANDRA-9047 URL: https://issues.apache.org/jira/browse/CASSANDRA-9047 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Tyler Hobbs Priority: Trivial Fix For: 2.1.4 Attachments: 9047-2.1.txt It looks like we accidentally forgot to add the FROZEN and TUPLE keywords to the list of unreserved keywords in Cql.g. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure
[ https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382492#comment-14382492 ] Benedict commented on CASSANDRA-8499: - Affected actions are: truncate, major compaction, cleanup, scrub, upgrade. Repair looks to be fine. Ensure SSTableWriter cleans up properly after failure - Key: CASSANDRA-8499 URL: https://issues.apache.org/jira/browse/CASSANDRA-8499 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.0.12, 2.1.3 Attachments: 8499-20.txt, 8499-20v2, 8499-21.txt, 8499-21v2, 8499-21v3 In 2.0 we do not free a bloom filter, in 2.1 we do not free a small piece of offheap memory for writing compression metadata. In both we attempt to flush the BF despite having encountered an exception, making the exception slow to propagate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
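The contract the ticket describes is: on a write failure the writer must abort (freeing the bloom filter and offheap buffers) instead of attempting the normal flush-and-close path, which both leaks and delays exception propagation. A toy sketch of that shape (the `FakeWriter` here is illustrative, not Cassandra's SSTableWriter):

```java
// Failure path must abort, never flush; success path flushes and closes.
public class WriterCleanup {
    static class FakeWriter {
        boolean closed, aborted;
        void append(boolean fail) { if (fail) throw new RuntimeException("disk error"); }
        void closeAndFlush() { closed = true; }  // normal path: flush bloom filter, close
        void abort() { aborted = true; }         // failure path: free resources, no flush
    }

    static FakeWriter write(boolean fail) {
        FakeWriter w = new FakeWriter();
        try {
            w.append(fail);
            w.closeAndFlush();
        } catch (RuntimeException e) {
            w.abort();  // do NOT attempt to flush after a failure
        }
        return w;
    }
}
```

In real code the caught exception would be rethrown after abort(); it is swallowed here only so the test can inspect the writer's final state.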
[jira] [Updated] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9048: --- Fix Version/s: 3.0 Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files much closer matches the format of the data more often than the SSTable format itself, and a tool that loads from delimited files is very useful. In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum: - support specifying the input file or to read the data from stdin (so other command-line programs can pipe into the loader) - supply the CQL schema for the input data - support all data types other than collections (collections is a stretch goal/need) - an option to specify the delimiter - an option to specify comma as the decimal delimiter (for international use casese) - an option to specify how NULL values are specified in the file (e.g., the empty string or the string NULL) - an option to specify how BOOLEAN values are specified in the file (e.g., TRUE/FALSE or 0/1) - an option to specify the Date and Time format - an option to skip some number of rows at the beginning of the file - an option to only read in some number of rows from the file - an option to indicate how many parse errors to tolerate - an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors) - an option to specify the CQL port to connect to (with 
9042 as the default). Additional options would be useful, but this set of options/features is a start. A word on COPY. COPY comes via CQLSH which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8989) Reading from table which contains collection type using token function and with CL ONE causes overwhelming writes to replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-8989: --- Reviewer: Sam Tunnicliffe Reading from table which contains collection type using token function and with CL ONE causes overwhelming writes to replicas --- Key: CASSANDRA-8989 URL: https://issues.apache.org/jira/browse/CASSANDRA-8989 Project: Cassandra Issue Type: Bug Components: Core Reporter: Miroslaw Partyka Assignee: Carl Yeksigian Priority: Critical Fix For: 2.0.14 Attachments: 8989-2.0.txt, trace.txt When reading from a table under the conditions described in the summary, each read from a replica also causes a write to the replica. Confirmed in versions 2.0.12 and 2.0.13; version 2.1.3 seems OK. To reproduce: {code}CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2}; USE test; CREATE TABLE bug(id int PRIMARY KEY, val map<int,int>); INSERT INTO bug(id, val) VALUES (1, {2: 3}); CONSISTENCY LOCAL_QUORUM TRACING ON SELECT * FROM bug WHERE token(id) = 0;{code} The trace contains both "Appending to commitlog" and "Adding to bug memtable" twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9034) AssertionError in SizeEstimatesRecorder
[ https://issues.apache.org/jira/browse/CASSANDRA-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383026#comment-14383026 ] Stefania commented on CASSANDRA-9034: - Looks good, +1 AssertionError in SizeEstimatesRecorder --- Key: CASSANDRA-9034 URL: https://issues.apache.org/jira/browse/CASSANDRA-9034 Project: Cassandra Issue Type: Bug Environment: Trunk (52ddfe412a) Reporter: Stefania Assignee: Carl Yeksigian Priority: Minor Fix For: 3.0 Attachments: 9034-trunk.txt One of the dtests of CASSANDRA-8236 (https://github.com/stef1927/cassandra-dtest/tree/8236) raises the following exception unless I set {{-Dcassandra.size_recorder_interval=0}}: {code} ERROR [OptionalTasks:1] 2015-03-25 12:58:47,015 CassandraDaemon.java:179 - Exception in thread Thread[OptionalTasks:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.service.StorageService.getLocalTokens(StorageService.java:2235) ~[main/:na] at org.apache.cassandra.db.SizeEstimatesRecorder.run(SizeEstimatesRecorder.java:61) ~[main/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:82) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_76] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_76] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76] INFO [RMI TCP Connection(2)-127.0.0.1] 2015-03-25 12:59:23,189 StorageService.java:863 - Joining ring by 
operator request {code} The test is {{start_node_without_join_test}} in _pushed_notifications_test.py_ but starting a node that won't join the ring might be sufficient to reproduce the exception (I haven't tried though). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9034) AssertionError in SizeEstimatesRecorder
[ https://issues.apache.org/jira/browse/CASSANDRA-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383035#comment-14383035 ] Stefania commented on CASSANDRA-9034: - [~iamaleksey] if you are happy could you take care of committing please? AssertionError in SizeEstimatesRecorder --- Key: CASSANDRA-9034 URL: https://issues.apache.org/jira/browse/CASSANDRA-9034 Project: Cassandra Issue Type: Bug Environment: Trunk (52ddfe412a) Reporter: Stefania Assignee: Carl Yeksigian Priority: Minor Fix For: 3.0 Attachments: 9034-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383081#comment-14383081 ] Jonathan Ellis commented on CASSANDRA-9048: --- How performant is this compared to CASSANDRA-7405? Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7970) JSON support for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382768#comment-14382768 ] Tyler Hobbs commented on CASSANDRA-7970: I've pushed some new commits to my branch to address your comments. I also merged in the latest trunk and added support for the new date and time types. bq. So, AbstractType.fromJSONObject would return a Term Done (over a few commits). The only hangup was that with collections and tuples, we need to avoid serializing elements in {{fromJSONObject}} because this can happen at prepare-time, when we don't know the protocol version. Accordingly, those classes return a DelayedValue instead of a terminal Value. JSON support for CQL Key: CASSANDRA-7970 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Tyler Hobbs Labels: client-impacting, cql3.3, docs-impacting Fix For: 3.0 Attachments: 7970-trunk-v1.txt JSON is popular enough that not supporting it is becoming a competitive weakness. We can add JSON support in a way that is compatible with our performance goals by *mapping* JSON to an existing schema: one JSON document maps to one CQL row. Thus, it is NOT a goal to support schemaless documents, which is a misfeature [1] [2] [3]. Rather, it is to allow a convenient way to easily turn a JSON document from a service or a user into a CQL row, with all the validation that entails. Since we are not looking to support schemaless documents, we will not be adding a JSON data type (CASSANDRA-6833) a la postgresql. Rather, we will map the JSON to UDTs, collections, and primitive CQL types.
Here's how this might look: {code} CREATE TYPE address ( street text, city text, zip_code int, phones set<text> ); CREATE TABLE users ( id uuid PRIMARY KEY, name text, addresses map<text, address> ); INSERT INTO users JSON {'id': 4b856557-7153, 'name': 'jbellis', 'address': {"home": {"street": "123 Cassandra Dr", "city": "Austin", "zip_code": 78747, "phones": [2101234567]}}}; SELECT JSON id, address FROM users; {code} (We would also want to_json and from_json functions to allow mapping a single column's worth of data. These would not require extra syntax.) [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/ [2] https://blog.compose.io/schema-less-is-usually-a-lie/ [3] http://dl.acm.org/citation.cfm?id=2481247 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8993) EffectiveIndexInterval calculation is incorrect
[ https://issues.apache.org/jira/browse/CASSANDRA-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382505#comment-14382505 ] Benedict commented on CASSANDRA-8993: - OK, so it makes a lot more sense now that I realise the downsampling granularity can be so small - I was thrown by "the BSL must be a power of two" without an equivalent statement for the sampling itself, and in my head I just assumed it was all dealing with powers of 2 (so nothing technically wrong with the comments, just my interpretation of them). This also explains why the effective index intervals were always the same - with powers of 2 they would be. I wonder if we couldn't get a lot of the benefit of downsampling by sticking to powers of 2, as it might simplify the code significantly? The original indices, indices to skip, and effective intervals could each be implemented with approximately one simple statement. Not pushing for it, mind, just airing the question. Thanks for taking the time to explain, anyway, and with that clarification I am +1 on the patch as it stands. On the topic of the zero index always being present: I can vouch that this assumption breaks somewhere, because I assumed this to be the case when modifying IndexSummaryBuilder, and without a setNextSamplePosition(-minIndexInterval) it doesn't pass its test cases (i.e. initializing the first sample index deterministically to zero caused unit test failures). So we should perhaps track down where the logical flaw is, however minor it may be. EffectiveIndexInterval calculation is incorrect --- Key: CASSANDRA-8993 URL: https://issues.apache.org/jira/browse/CASSANDRA-8993 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Blocker Fix For: 2.1.4 Attachments: 8993-2.1-v2.txt, 8993-2.1.txt, 8993.txt I'm not familiar enough with the calculation itself to understand why this is happening, but see discussion on CASSANDRA-8851 for the background.
I've introduced a test case to look for this during downsampling, but it seems to pass just fine, so it may be an artefact of upgrading. The problem was, unfortunately, not manifesting directly because it would simply result in a failed lookup. This was only exposed when early opening used firstKeyBeyond, which does not use the effective interval, and provided the result to getPosition(). I propose a simple fix that ensures a bug here cannot break correctness. Perhaps [~thobbs] can follow up with an investigation as to how it actually went wrong? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
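The power-of-two simplification floated in the comment above can be sketched in a few lines. This is illustrative only: the class and method names are hypothetical, and only the base sampling level of 128 matches Cassandra's {{Downsampling.BASE_SAMPLING_LEVEL}} constant.

```java
// Illustrative sketch of the power-of-two idea: if the sampling level is
// restricted to powers of two, the effective index interval collapses to a
// single expression rather than a per-entry calculation. Hypothetical code.
public class PowerOfTwoSampling {
    static final int BASE_SAMPLING_LEVEL = 128; // power of two, as in Cassandra

    static int effectiveInterval(int samplingLevel, int minIndexInterval) {
        // Enforce the power-of-two restriction the comment proposes
        if (Integer.bitCount(samplingLevel) != 1)
            throw new IllegalArgumentException("sampling level must be a power of two");
        // At full sampling (samplingLevel == BASE_SAMPLING_LEVEL) the interval
        // is unchanged; halving the sampling level doubles the spacing.
        return (BASE_SAMPLING_LEVEL / samplingLevel) * minIndexInterval;
    }
}
```

With this restriction every retained entry has the same effective interval, which is consistent with the observation above that the intervals "were always the same" for powers of 2.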
[jira] [Commented] (CASSANDRA-9044) Build prototype for validation testing harness
[ https://issues.apache.org/jira/browse/CASSANDRA-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382514#comment-14382514 ] Philip Thompson commented on CASSANDRA-9044: Merged into dtest and running here: http://cassci.datastax.com/job/CTOOL_stress_validation/ Long term, this will probably need to move out of dtest, or at least into its own submodule. Build prototype for validation testing harness -- Key: CASSANDRA-9044 URL: https://issues.apache.org/jira/browse/CASSANDRA-9044 Project: Cassandra Issue Type: Sub-task Reporter: Philip Thompson Assignee: Philip Thompson Build a job and set it to run on jenkins for the basic stress validation described in CASSANDRA-9007. Currently only using CCM nodes and log parsing stress for errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9049) Run validation harness against a real cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-9049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9049: --- Issue Type: Sub-task (was: Task) Parent: CASSANDRA-9007 Run validation harness against a real cluster - Key: CASSANDRA-9049 URL: https://issues.apache.org/jira/browse/CASSANDRA-9049 Project: Cassandra Issue Type: Sub-task Reporter: Philip Thompson Assignee: Philip Thompson Currently we run against CCM nodes. We will get more useful data and feedback if we run against real C* clusters, whether on dedicated hardware or provisioned on a cloud. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9049) Run validation harness against a real cluster
Philip Thompson created CASSANDRA-9049: -- Summary: Run validation harness against a real cluster Key: CASSANDRA-9049 URL: https://issues.apache.org/jira/browse/CASSANDRA-9049 Project: Cassandra Issue Type: Task Reporter: Philip Thompson Assignee: Philip Thompson -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382602#comment-14382602 ] Jeff Jirsa commented on CASSANDRA-9048: --- 2 cents: Agree with [~carlyeks]'s comment: seems useful, but don't see why it would go into the tree. Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7970) JSON support for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-7970: --- Attachment: 7970-trunk-v2.txt JSON support for CQL Key: CASSANDRA-7970 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Tyler Hobbs Labels: client-impacting, cql3.3, docs-impacting Fix For: 3.0 Attachments: 7970-trunk-v1.txt, 7970-trunk-v2.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7807) Push notification when tracing completes for an operation
[ https://issues.apache.org/jira/browse/CASSANDRA-7807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-7807: Attachment: 7807-v2.txt I've added functionality to {{SimpleClient}} to specify the requested protocol version (also added support for that to the {{Client}}/{{debug-cql}} tool). The code now * checks for protocol version 4 (added utest) * checks if the connection registered for the event (added negative utest) * ensures that the event is not sent when the tracing probability kicks in (added utest) * adds a {{minimumVersion}} field to the {{Event.Type}} enum * also enhances {{MessagePayloadTest}} to check behavior with native protocol 4 (CASSANDRA-8553) * also enhances {{debug-cql}} to specify the native protocol version + event display NB: {{debug-cql}} does not start when C* is running locally (port 7199 already in use), since it sources {{cassandra-env.sh}} Push notification when tracing completes for an operation - Key: CASSANDRA-7807 URL: https://issues.apache.org/jira/browse/CASSANDRA-7807 Project: Cassandra Issue Type: Sub-task Components: Core Reporter: Tyler Hobbs Assignee: Robert Stupp Priority: Minor Labels: client-impacting, protocolv4 Fix For: 3.0 Attachments: 7807-v2.txt, 7807.txt Tracing is an asynchronous operation, and drivers currently poll to determine when the trace is complete (in a loop with sleeps). Instead, the server could push a notification to the driver when the trace completes. I'm guessing that most of the work for this will be around pushing notifications to a single connection instead of all connections that have registered listeners for a particular event type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
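The {{minimumVersion}} idea described above can be sketched as an enum field that gates which connections receive an event. This is a hypothetical shape, not the actual {{org.apache.cassandra.transport.Event}} code: the trace-completion event name and the version numbers on the other entries are assumptions for illustration.

```java
// Hypothetical sketch: tag each push-event type with the minimum native
// protocol version allowed to carry it, and check it per connection.
// TRACE_COMPLETE and the version constants are illustrative only.
public class Events {
    enum EventType {
        TOPOLOGY_CHANGE(3), STATUS_CHANGE(3), SCHEMA_CHANGE(3), TRACE_COMPLETE(4);

        final int minimumVersion;
        EventType(int minimumVersion) { this.minimumVersion = minimumVersion; }

        // Only send this event on connections negotiated at or above minimumVersion
        boolean supportedBy(int connectionVersion) {
            return connectionVersion >= minimumVersion;
        }
    }
}
```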
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382777#comment-14382777 ] Jeremy Hanna commented on CASSANDRA-9048: - I wonder if a bulk loader, if it's not a thick-client thing, would need to come in different forms: small, medium, and large. Small: a single file for bootstrapping, not a huge amount of data; cqlsh COPY FROM would work for that. Medium: the tool that this ticket represents; you might have a bunch of files, but you don't want to have to fire up Spark to do a simple bulk load. Large/industrial: for a giant amount of data, perhaps on a regular basis, or if you were going to fire up Spark anyway. I think all three have their uses, and I see each of them being more favorable in different situations. If we have demand and people willing to maintain each of the three, why wouldn't we consider them? Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9050) Add debug level logging to Directories.getWriteableLocation()
[ https://issues.apache.org/jira/browse/CASSANDRA-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9050: Attachment: 9050-2.1.txt 9050-2.0.txt Add debug level logging to Directories.getWriteableLocation() - Key: CASSANDRA-9050 URL: https://issues.apache.org/jira/browse/CASSANDRA-9050 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Fix For: 2.0.14 Attachments: 9050-2.0.txt, 9050-2.1.txt Add some debug level logging to log * blacklisted directories that are excluded * directories not matching requested size -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
[ https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko resolved CASSANDRA-6541. -- Resolution: Fixed New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set. - Key: CASSANDRA-6541 URL: https://issues.apache.org/jira/browse/CASSANDRA-6541 Project: Cassandra Issue Type: Bug Components: Config Reporter: jonathan lacefield Assignee: Brandon Williams Priority: Minor Fix For: 2.1 beta2, 2.0.6, 1.2.16 Attachments: dse_systemlog Newer versions of Oracle's Hotspot JVM, post 6u43 (maybe earlier) and 7u25 (maybe earlier), are experiencing issues with GC and JMX where the heap slowly fills up over time until an OOM or a full GC event occurs, specifically when CMS is leveraged. Adding: {noformat} JVM_OPTS="$JVM_OPTS -XX:+CMSClassUnloadingEnabled" {noformat} to the options in cassandra-env.sh alleviates the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-8478) sstableloader NegativeArraySizeException when deserializing to build histograms
[ https://issues.apache.org/jira/browse/CASSANDRA-8478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko resolved CASSANDRA-8478. -- Resolution: Won't Fix Fix Version/s: (was: 1.2.15) 1.2.x C* releases will no longer happen. Feel free to reopen though if the issue can be reproduced in 2.0 or 2.1. sstableloader NegativeArraySizeException when deserializing to build histograms --- Key: CASSANDRA-8478 URL: https://issues.apache.org/jira/browse/CASSANDRA-8478 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Erick Ramirez When a customer attempts to load sstable data files copied from a production cluster, it returns the following exception: {code} $ sstableloader -d ip -p rpc_port -v KS/CF/ null java.lang.NegativeArraySizeException at org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:266) at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:292) at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:282) at org.apache.cassandra.io.sstable.SSTableReader.openMetadata(SSTableReader.java:234) at org.apache.cassandra.io.sstable.SSTableReader.openForBatch(SSTableReader.java:162) at org.apache.cassandra.io.sstable.SSTableLoader$1.accept(SSTableLoader.java:100) at java.io.File.list(File.java:1155) at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:67) at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:121) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:66) -pr,--principal kerberos principal -k,--keytab keytab location --ssl-keystore ssl keystore location --ssl-keystore-password ssl keystore password --ssl-keystore-type ssl keystore type --ssl-truststore ssl truststore location --ssl-truststore-password ssl truststore password --ssl-truststore-type ssl truststore type {code} It appears to be 
failing on this line of code: {code} public EstimatedHistogram deserialize(DataInput dis) throws IOException { int size = dis.readInt(); long[] offsets = new long[size - 1]; // <-- here {code} The same error is returned regardless of which data file is attempted. I suspect this may be due to corrupt data files, or to data written in a way that is not compatible with the sstableloader utility. NOTE: Both source and target clusters are DSE 3.2.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
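The failure above is consistent with a corrupt or incompatible stream: {{readInt()}} returns a negative size, and {{new long[size - 1]}} then throws {{NegativeArraySizeException}}. One defensive pattern is to validate the length prefix before allocating. The sketch below is hypothetical code, not the actual {{EstimatedHistogramSerializer}}; it merely shows the guard.

```java
import java.io.DataInput;
import java.io.IOException;

// Hypothetical sketch of a guarded deserialization for a length-prefixed
// long[] like the histogram offsets above: a corrupt stream produces a
// clear IOException instead of a NegativeArraySizeException.
public class GuardedHistogramRead {
    static long[] readOffsets(DataInput in) throws IOException {
        int size = in.readInt();
        if (size < 1)
            throw new IOException("invalid histogram bucket count: " + size);
        long[] offsets = new long[size - 1]; // safe: size - 1 >= 0 here
        for (int i = 0; i < offsets.length; i++)
            offsets[i] = in.readLong();
        return offsets;
    }
}
```

The guard does not fix the underlying incompatibility between the DSE 3.2.5 files and the loader, but it turns a confusing runtime error into an actionable message.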
[jira] [Updated] (CASSANDRA-8919) cqlsh return error in querying of CompositeType data
[ https://issues.apache.org/jira/browse/CASSANDRA-8919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8919: - Fix Version/s: (was: 2.1.2) 2.1.4 cqlsh return error in querying of CompositeType data Key: CASSANDRA-8919 URL: https://issues.apache.org/jira/browse/CASSANDRA-8919 Project: Cassandra Issue Type: Bug Components: Tools Environment: SUSE 11 SP3, C* 2.1.2 Reporter: Mark Assignee: Tyler Hobbs Priority: Minor Labels: cqlsh Fix For: 2.1.4 cqlsh returns the below error when querying CompositeType data. It seems deserialize_safe is undefined for this CompositeType. Is this an issue that needs to be fixed? {code} cassandra@cqlsh:up_data> select * from test_stand; Traceback (most recent call last): File "/home/mql/bin/cqlsh", line 986, in perform_simple_statement rows = self.session.execute(statement, trace=self.tracing_enabled) File "/home/mql/bin/../lib/cassandra-driver-internal-only-2.1.2.zip/cassandra-driver-2.1.2/cassandra/cluster.py", line 1294, in execute result = future.result(timeout) File "/home/mql/bin/../lib/cassandra-driver-internal-only-2.1.2.zip/cassandra-driver-2.1.2/cassandra/cluster.py", line 2788, in result raise self._final_exception AttributeError: type object 'CompositeType(UTF8Type, Int32Type)' has no attribute 'deserialize_safe' {code} Pre-condition (in cassandra-cli) {code} create keyspace up_data with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = {replication_factor:1}; use up_data; create column family test_stand with column_type = 'Standard' and comparator = 'UTF8Type' and default_validation_class = 'BytesType' and key_validation_class = 'UTF8Type' and column_metadata = [ {column_name : 'UTF8Typefield', validation_class : 'UTF8Type'}, {column_name : 'IntegerTypefield', validation_class : 'IntegerType'}, {column_name : 'CompositeTypefield', validation_class : 'CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.Int32Type)'} ] and
compression_options = null; set test_stand ['test_stand1']['UTF8Typefield']='utf8Type'; set test_stand ['test_stand1']['CompositeTypefield']='utf8Type,12'; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9050) Add debug level logging to Directories.getWriteableLocation()
Robert Stupp created CASSANDRA-9050: --- Summary: Add debug level logging to Directories.getWriteableLocation() Key: CASSANDRA-9050 URL: https://issues.apache.org/jira/browse/CASSANDRA-9050 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Add some debug level logging to log * blacklisted directories that are excluded * directories not matching requested size -- This message was sent by Atlassian JIRA (v6.3.4#6332)