Re: [Cas 2.0.2] Looping Repair since activating PasswordAuthenticator
Hi Yuki, thanks for your answer. I still do not know if it is expected behaviour that Cassandra tries to repair these 1280 ranges every time I run a nodetool repair on every node?

Regards, Dennis

On 03.11.2013 03:27, Yuki Morishita wrote:
Hi Dennis, As you can see in the output,

[2013-10-31 09:39:59,811] Starting repair command #1, repairing 1280 ranges for keyspace system_auth

repair was trying to repair 1280 ranges. I imagine you are using vnodes, and since Cassandra does repair almost sequentially, range by range, it will take some time. You can specify the range to repair using the '-st' and '-et' options. For more info, see http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/operations/ops_repair_nodes_c.html.

On Thu, Oct 31, 2013 at 3:42 AM, Dennis Schwan dennis.sch...@1und1.de wrote:
Hi there, I have used this manual: http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/security/security_config_native_authenticate_t.html to set up the PasswordAuthenticator, but now every time I run a nodetool repair it repairs the system_auth keyspace, which takes about 10 to 15 minutes.

nodetool repair
[2013-10-31 09:39:59,623] Nothing to repair for keyspace 'system'
[2013-10-31 09:39:59,811] Starting repair command #1, repairing 1280 ranges for keyspace system_auth

This is what I get on every node every time I start a repair.
Logfile:
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,632 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.60 and /10.30.9.61 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,634 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.60 and /10.30.9.58 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,638 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.60 and /10.30.9.59 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,642 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.60 and /10.30.9.57 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,643 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.61 and /10.30.9.58 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,644 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.61 and /10.30.9.59 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,644 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.61 and /10.30.9.57 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,645 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.58 and /10.30.9.59 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,646 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.58 and /10.30.9.57 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,647 Differencer.java (line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.59 and /10.30.9.57 are consistent for credentials
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,648 RepairSession.java (line 214) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] credentials is fully synced
INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,942 RepairSession.java (line 157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received merkle tree for permissions from /10.30.9.60
INFO [AntiEntropyStage:1] 2013-10-31 09:40:56,047 RepairSession.java (line 157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received merkle tree for permissions from /10.30.9.61
INFO [AntiEntropyStage:1] 2013-10-31 09:40:56,129 RepairSession.java (line 157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received merkle tree for permissions from /10.30.9.58
INFO [AntiEntropyStage:1] 2013-10-31 09:40:56,190 RepairSession.java (line 157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received merkle tree for permissions from /10.30.9.59

Is this expected behaviour? I have only added a new superuser and changed the password of the default superuser, so there should not be too much to do at all. Thanks for your help!

Dennis
--
Dennis Schwan
Oracle DBA Mail Core
1&1 Internet AG | Brauerstraße 48 | 76135 Karlsruhe | Germany
Phone: +49 721 91374-8738
E-Mail: dennis.sch...@1und1.de | Web: www.1und1.de
Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 6484
Vorstand: Ralph Dommermuth, Frank Einhellinger, Robert Hoffmann, Andreas Hofmann, Markus Huhn, Hans-Henning Kettler, Uwe Lamnek, Jan Oetjen, Christian Würst
Aufsichtsratsvorsitzender: Michael Scheeren
Member of United Internet
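The '-st'/'-et' subrange approach Yuki mentions can be sketched as follows; the token placeholders below are illustrative and would be taken from your own ring, not from this cluster:

```shell
# List this node's token ranges first (output format varies by version):
nodetool ring

# Then repair only one token range of the system_auth keyspace.
# <start_token> and <end_token> are placeholders for values read off the ring.
nodetool repair -st <start_token> -et <end_token> system_auth
```

Repairing only the ranges you care about avoids walking all 1280 vnode ranges in one command, at the cost of scripting the iteration yourself.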
Re: Bad Request: No indexed columns present in by-columns clause with Equal operator?
I tested the same and it seems that you cannot run such queries with indexed columns. Probably you need at least one equality condition in the WHERE clause; I am not sure. You can achieve your goal by defining the primary key as follows:

create table test (
    employee_id text,
    employee_name text,
    value text,
    last_modified_date timeuuid,
    primary key (employee_id, last_modified_date)
);

and then querying like this:

select * from test where last_modified_date > mintimeuuid('2013-11-03 13:33:30') and last_modified_date < maxtimeuuid('2013-11-05 13:33:45') ALLOW FILTERING;

However, that will be slow because it has to do scanning; therefore you need to say ALLOW FILTERING. Without that you will get a warning:

Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING

The performance of using Cassandra like this is probably far from optimal. Hannu

2013/11/3 Techy Teck comptechge...@gmail.com:
Thanks Hannu, I got your point. But in my example `employee_id` won't be larger than `32767`, so I am thinking of creating an index on these two columns:

create index employee_name_idx on test (employee_name);
create index last_modified_date_idx on test (last_modified_date);

The chances of executing the queries above are very minimal. Very rarely will we execute them, but if we do, I want the system to be capable of it. Now I can execute the below queries after creating an index:

select * from test where employee_name = 'e27';
select employee_id from test where employee_name = 'e27';
select * from test where employee_id = '1';

But I cannot execute the below query, which is: give me everything that has changed within 15 minutes.
So I wrote the below query:

select * from test where last_modified_date > mintimeuuid('2013-11-03 13:33:30') and last_modified_date < maxtimeuuid('2013-11-03 13:33:45');

But it doesn't run, and I always get this error:

Bad Request: No indexed columns present in by-columns clause with Equal operator

Any thoughts on what I am doing wrong here?

On Sun, Nov 3, 2013 at 12:43 PM, Hannu Kröger hkro...@gmail.com wrote:
Hi, you cannot query using a field that is not indexed in CQL. You have to either create a secondary index, or create index tables, manage those indexes yourself, and query using those. Since those keys are of high cardinality, the usual recommendation for this kind of use case is to create several tables holding all the data:

1) A table with employee_id as the primary key.
2) A table with last_modified_at as the primary key (use case 2).
3) A table with employee_name as the primary key (your test query with employee_name = 'e27' and use cases 1 & 3).

Then you populate all those tables with your data and use the table that matches the query.
Cheers, Hannu

2013/11/3 Techy Teck comptechge...@gmail.com:
I have the below table in CQL:

create table test (
    employee_id text,
    employee_name text,
    value text,
    last_modified_date timeuuid,
    primary key (employee_id)
);

I inserted a couple of records into the above table, the same way I will be inserting in our actual use case:

insert into test (employee_id, employee_name, value, last_modified_date) values ('1', 'e27', 'some_value', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('2', 'e27', 'some_new_value', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('3', 'e27', 'some_again_value', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('4', 'e28', 'some_values', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('5', 'e28', 'some_new_values', now());

Now I was running a select query for: give me all the employee_id for employee_name 'e27'.

select employee_id from test where employee_name = 'e27';

And this is the error I am getting:

Bad Request: No indexed columns present in by-columns clause with Equal operator
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.

Is there anything wrong I am doing here? My use cases in general are:
1. Give me everything for any of the employee_name.
2. Give me everything that has changed in the last 5 minutes.
3. Give me the latest employee_id for any of the employee_name.

I am running Cassandra 1.2.11
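Hannu's per-query-table suggestion can be sketched in CQL as follows; the table name and clustering order are illustrative, not part of the original thread:

```cql
-- Use cases 1 and 3: look up by employee_name, newest change first.
CREATE TABLE test_by_name (
    employee_name text,
    last_modified_date timeuuid,
    employee_id text,
    value text,
    PRIMARY KEY (employee_name, last_modified_date)
) WITH CLUSTERING ORDER BY (last_modified_date DESC);

-- Written alongside the main table on every insert/update, then queried as:
SELECT employee_id FROM test_by_name WHERE employee_name = 'e27' LIMIT 1;
```

The application takes on the duplicate writes, but each query then hits exactly one partition with no secondary index or filtering involved.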
filter using timeuuid column type
Hi, is it possible to filter records using timeuuid column types when the column is not part of the primary key? I tried the following:

[cqlsh 3.1.2 | Cassandra 1.2.10.1 | CQL spec 3.0.0 | Thrift protocol 19.36.0]

CREATE TABLE timeuuid_test2 (
    row_key text,
    time timeuuid,
    time2 timeuuid,
    message text,
    PRIMARY KEY (row_key, time)
);

cqlsh: select * from timeuuid_test2 where time2 < now();
Bad Request: No indexed columns present in by-columns clause with Equal operator

I tried to create the required index:

create index timeuuid_test2_idx on timeuuid_test2 (time2);
Bad Request: No indexed columns present in by-columns clause with Equal operator

The result is the same... If the column used is time, then everything is OK:

select * from timeuuid_test2 where time < now() ALLOW FILTERING;

The question is: why can't I use the 'time2' column when filtering, despite the column being indexed? Thanks, Ferenc
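What does work in this Cassandra version is a range predicate on a clustering column rather than on a secondary-indexed column: secondary indexes only support equality here. A minimal sketch of a second table clustered on time2 (the table name is invented for illustration):

```cql
CREATE TABLE timeuuid_test2_by_time2 (
    row_key text,
    time2 timeuuid,
    time timeuuid,
    message text,
    PRIMARY KEY (row_key, time2)
);

-- Within one partition, range filters on the clustering column are allowed:
SELECT * FROM timeuuid_test2_by_time2
 WHERE row_key = 'some_key' AND time2 < now();
```

The cost is maintaining the second table on every write, but the range query becomes an ordinary clustering-key slice.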
Managing index tables
What is the best way to manage index tables on update/deletion of the indexed data?

I have a table containing all kinds of data for a user, i.e. name, address, contact data, company data etc. The key to this table is the user ID. I also maintain about a dozen index tables matching my queries, such as name, email address, company D.U.N.S number, permissions the user has, etc. These index tables contain the user IDs matching the search key as column names, with the column values left empty.

Whenever a user is deleted or updated I have to make sure to update the index tables, i.e. if the permissions of a user change I have to remove the user ID from the rows matching the permissions he no longer has. My problem is finding all matching entries, especially for data I no longer have. My solution so far is to keep a separate table tracking all index tables and keys the user can be found in. In the case mentioned, I look up the keys for the permissions table, remove the user ID from there, then remove the entry in the keys table.

This works so far (in production for more than a year and a half), and it also allows me to clean up after something has gone wrong. But still, this additional level of meta information adds a lot of complexity. I was wondering whether there is some kind of pattern that addresses my problem. I found lots of information saying that creating the index tables is the way to go, but nobody ever mentions maintaining them.

tia, Thomas
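Thomas's bookkeeping table, a reverse index of the index entries each user appears in, might look like this in CQL3 terms. This is only a sketch of the pattern he describes; his production schema is wide-row/Thrift style, and all names below are invented:

```cql
CREATE TABLE user_index_entries (
    user_id text,
    index_table text,   -- e.g. 'users_by_permission'
    index_key text,     -- e.g. 'permission:admin'
    PRIMARY KEY (user_id, index_table, index_key)
);

-- On user update or delete: read this partition, remove the user ID from
-- each listed index row, then delete (or rewrite) the bookkeeping entries.
SELECT index_table, index_key FROM user_index_entries WHERE user_id = 'u123';
```

Since Cassandra has no multi-table transactions in this version, the write order matters: update the bookkeeping table last, so a crash mid-update leaves stale pointers you can clean up rather than orphaned index entries you can no longer find.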
Re: Cass 1.1.11 out of memory during compaction ?
If I do that, wouldn't I need to scrub my sstables?

Takenori Sato ts...@cloudian.com wrote:
Try increasing column_index_size_in_kb. A slice query to get some ranges (SliceFromReadCommand) requires reading all the column indexes for the row, and thus could hit OOM if you have a very wide row.

On Sun, Nov 3, 2013 at 11:54 PM, Oleg Dulin oleg.du...@gmail.com wrote:
Cass 1.1.11 ran out of memory on me with this exception (see below). My parameters are 8 gig heap, new gen is 1200M.

ERROR [ReadStage:55887] 2013-11-02 23:35:18,419 AbstractCassandraDaemon.java (line 132) Exception in thread Thread[ReadStage:55887,5,main]
java.lang.OutOfMemoryError: Java heap space
at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:323)
at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:398)
at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:380)
at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:88)
at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:83)
at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:73)
at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:179)
at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:121)
at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:48)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:116)
at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:117)
at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:140)
at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:292)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1362)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1224)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1159)
at org.apache.cassandra.db.Table.getRow(Table.java:378)
at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:51)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

Any thoughts? This is a dual data center setup, with 4 nodes in each DC and RF=2 in each.

--
Regards,
Oleg Dulin
http://www.olegdulin.com
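The knob Takenori suggests lives in cassandra.yaml; a sketch, with an illustrative value (the shipped default is 64):

```yaml
# Granularity of the per-row column index. A larger value means fewer
# index entries have to be deserialized for a wide row during a slice
# read, at the cost of coarser seeks within the row.
column_index_size_in_kb: 256
```

Note his question is reasonable: changing this only affects sstables written afterwards, so existing wide rows keep their old index granularity until they are rewritten by compaction (or scrub/upgradesstables).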
Re: IllegalStateException when bootstrapping new nodes
Can no one find anything useful in our logs? :( -- Cyril SCETBON

On 29 Oct 2013, at 16:38, Cyril Scetbon cyril.scet...@free.fr wrote: Sorry, but as that link is bad, here is the good one: http://www.sendspace.com/file/7p81lz
Duplicate hard link - Cassandra 1.2.9
Cassandra 1.2.9, embedded into the RHQ 4.9 project. I'm getting the following:

Caused by: java.lang.RuntimeException: Tried to create duplicate hard link to /data05/rhq/data/system/NodeIdInfo/snapshots/1383587405678/system-NodeIdInfo-ic-1-TOC.txt
at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:70)
at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1081)
at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1567)
at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1612)
at org.apache.cassandra.db.Table.snapshot(Table.java:194)
at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:2203)

Clearing the snapshot directory doesn't seem to fix this issue, nor does restarting the node. This is obviously a bug, but I'm not sure what to do about it. I don't have enough context to know what it means or how to fix it.
1.1.11: system keyspace is filling up
I have a dual DC setup, 4 nodes with RF=4 in each. The one that is used as primary has had its system keyspace fill up with 200 gigs of data, the majority of which is hints. Why does this happen? How can I clean it up? -- Regards, Oleg Dulin http://www.olegdulin.com
CFP - NoSQL Room FOSDEM Cassandra Community
Hi all, We're pleased to announce the call for participation for the NoSQL devroom, returning after a great last year. NoSQL is an encompassing term that covers a multitude of different and interesting database solutions. As the interest in NoSQL continues to grow, we are looking for talks on any open source NoSQL database or related topic. Speaking slots are 25 or 50 minutes. To propose a talk, please go to: https://penta.fosdem.org/submission/FOSDEM14

As FOSDEM is a friendly open source conference, please refrain from slagging matches about each other's projects. Keep it respectful, keep it non-commercial, and remember that all decks are subject to approval.

If you do not want to give a talk yourself but have ideas for NoSQL topics, send them to the mailing list at nosql-devr...@lists.fosdem.org. Know someone who might be interested in the devroom? Please forward them this email on our behalf. Want to help out but don't know how? Contact us!

The devroom is scheduled for Sunday, February 2nd and has approx. 80 seats. The call for proposals is open until Dec 13th, and speakers will be notified by December 20th. The final schedule will then be announced by January 10th. Any changes will be announced on the mailing list: https://lists.fosdem.org/listinfo/nosql-devroom

Original announcement: http://www.lczajkowski.com/2013/10/10/cfp-for-nosql-devroom-at-fosdem/

Laura
Re: Duplicate hard link - Cassandra 1.2.9
On Mon, Nov 4, 2013 at 10:08 AM, Elias Ross gen...@noderunner.net wrote: Cassandra 1.2.9, embedded into the RHQ 4.9 project. I'm getting the following: Caused by: java.lang.RuntimeException: Tried to create duplicate hard link to /data05/rhq/data/system/NodeIdInfo/snapshots/1383587405678/system-NodeIdInfo-ic-

Someone else had a similar issue, but on upgrade to 2.0.x from 1.2.10: http://mail-archives.apache.org/mod_mbox/cassandra-user/201309.mbox/%3C00b801ceb9ef$e3cc6a70$ab653f50$@struq.com%3E In that case, it was: https://issues.apache.org/jira/browse/CASSANDRA-6093 But I'm not sure that applies to your issue? If not, file a JIRA! :D =Rob
Re: 1.1.11: system keyspace is filling up
On Mon, Nov 4, 2013 at 11:34 AM, Oleg Dulin oleg.du...@gmail.com wrote: I have a dual DC setup, 4 nodes, RF=4 in each. The one that is used as primary has its system keyspace fill up with 200 gigs of data, majority of which is hints. Why does this happen? How can I clean it up?

If you have this many hints, you probably have flapping / frequent network partitions, or very overloaded nodes. Comparing the number of hints to the number of dropped messages would be informative: if you're hinting because you're dropping, increase capacity; if you're hinting because of partitions, figure out why there is so much partition.

WRT cleaning up hints: they will automatically be cleaned up eventually, as long as they are successfully being delivered. If you need to clean them up manually, you can truncate the system.hints table. =Rob
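The manual cleanup Rob mentions can be sketched as follows. Note this drops any undelivered hints, so the writes they represented must then be reconciled with a repair; whether system tables can be truncated from cqlsh depends on the Cassandra version:

```cql
-- Removes all stored hints on the node you are connected to.
TRUNCATE system.hints;
```

After truncating, running a repair on the affected keyspaces restores the consistency that hint replay would otherwise have provided.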
Re: Strange exception when storing heavy data in cassandra 2.0.0...
On Fri, Nov 1, 2013 at 10:29 PM, Krishna Chaitanya bnsk1990r...@gmail.com wrote: I am a newbie to the Cassandra world. I am currently using Cassandra 2.0.0 with thrift 0.8.0 for storing netflow packets, using the libQtCassandra library. ... Is this a known issue? It did not occur when we were using Cassandra 1.2.6 and previous versions with the pycassa library for accessing the database. ... How can I avoid this exception, and is there any way to get my node back to a running state, even if it means reinstalling Cassandra? Thank you in advance for any help.

https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

You could try upgrading to 2.0.2, or downgrading back to 1.2.6. The former is likely to be easier / more possible than the latter, but may or may not resolve your problem. The latter may require a dump/load or reload of your cluster's data if sstables have been upgraded to a 2.0 version. Note: I have not looked for your particular issue in the Apache JIRA; there may be value in your doing so. =Rob
Re: Duplicate hard link - Cassandra 1.2.9
Thanks Robert. CASSANDRA-6298. Is there any way to work around it? My thinking is that the duplicate hard link is probably harmless, and getting rid of the check would at least get me past this issue.
Re: Cass 1.1.11 out of memory during compaction ?
I would go with cleanup. Be careful of this bug: https://issues.apache.org/jira/browse/CASSANDRA-5454

On Mon, Nov 4, 2013 at 9:05 PM, Oleg Dulin oleg.du...@gmail.com wrote: If I do that, wouldn't I need to scrub my sstables?

Takenori Sato ts...@cloudian.com wrote: Try increasing column_index_size_in_kb. A slice query to get some ranges (SliceFromReadCommand) requires reading all the column indexes for the row, and thus could hit OOM if you have a very wide row.

On Sun, Nov 3, 2013 at 11:54 PM, Oleg Dulin oleg.du...@gmail.com wrote: Cass 1.1.11 ran out of memory on me with this exception (see below). My parameters are 8 gig heap, new gen is 1200M.

[OutOfMemoryError stack trace quoted in full in the original message, above]

Any thoughts? This is a dual data center setup, with 4 nodes in each DC and RF=2 in each.

--
Regards,
Oleg Dulin
http://www.olegdulin.com
Re: CF name length restrictions (CASSANDRA-4157 and CASSANDRA-4110)
My understanding of CASSANDRA-4110 is that the file name (not the total path length) has to be = 255 chars long. On not windows platforms in 1.1.0+ you should be ok with KS + CF names that combined go up to about 230 chars. Leaving room for the extra few things Cassandra dds to the SStable file names. Cheers - Aaron Morton New Zealand @aaronmorton Co-Founder Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com On 1/11/2013, at 5:54 am, Peter Sanford psanf...@nearbuysystems.com wrote: We're working on upgrading from 1.0.12 to 1.1.12. After upgrading a test node I ran into CASSANDRA-4157 which restricts the max length of CF names to = 48 characters. It looks like CASSANDRA-4110 will allow us to upgrade and keep our existing long CF names, but we won't be able to create new CFs with names longer than 48 chars. Is there any reason that the logic from 4110 wasn't also applied to the 4157 code path? (Our naming convention results in a lot of materialized view CFs that have names 48 characters.) -psanford
Re: Storage management during rapid growth
However, when monitoring the performance of our cluster, we see sustained periods - especially during repair/compaction/cleanup - of several hours where there are > 2000 IOPS.

If the IOPS are there, compaction / repair / cleanup will use them if the configuration allows it. If they are not there and the configuration matches the resources, the only issue will be that things take longer (assuming the HW can handle the throughput).

2) Move some nodes to a SAN solution, ensuring that there is a mix of storage, drives,

IMHO you will have a terrible time and regret the decision. Performance in anger rarely matches local disks, and when someone decides the SAN needs to go through a maintenance process, say goodbye to your node. You will also need very good network links. Cassandra is designed for a shared-nothing architecture; it's best to embrace that.

1) Has anyone moved from SSDs to spinning-platter disks, or managed a cluster that contained both? Do the numbers we're seeing exaggerate the performance hit we'd see if we moved to spinners?

Try to get a feel for the general IOPS used for reads without compaction etc. running, and also for the bytes going into the cluster on the rpc / native binary interface.

2) Have you successfully used a SAN or a hybrid SAN solution (some local, some SAN-based) to dynamically add storage to the cluster? What type of SAN have you used, and what issues have you run into?

I've worked with people who have internal SANs and those that have used EBS. I would not describe either solution as optimal. The issues are performance under load, network contention, and SLA / consistency.

3) Am I missing a way of economically scaling storage?

Version 1.2+ has better support for fat nodes (nodes with up to 5TB of data) via:

* JBOD: mount each disk independently and add it to data_file_directories. Cassandra will balance the write load between disks and have one flush thread per data directory; I've heard this gives good performance with HDDs.
This will give you 100% of the raw disk capacity, and a single disk failure does not necessitate a full node rebuild.
* disk failure: set disk_failure_policy to best_effort or stop so the node can handle disk failure: https://github.com/apache/cassandra/blob/cassandra-1.2/conf/cassandra.yaml#L125
* have good networking in place so you can rebuild a failed node, either completely or from a failed disk.
* use vnodes so that as the number of nodes grows, the time to rebuild a failed node drops.

I would be a little uneasy about very high node loads with only three nodes; the main concern is how long it will take to replace a node that completely fails. I've also seen people have a good time moving from SSDs to 12 fast disks in a RAID-10 config. You can mix HDDs and SSDs and have some hot CFs on the SSDs and others on the HDDs.

Hope that helps.

- Aaron Morton New Zealand @aaronmorton Co-Founder Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com

On 1/11/2013, at 10:01 am, Franc Carter franc.car...@sirca.org.au wrote: I can't comment on the technical question; however, one thing I learnt managing the growth of data is that the $/GB of storage tends to drop at a rate that can absorb a moderate proportion of the increase in cost due to the increase in size of data. I'd recommend having a wet-finger-in-the-air stab at projecting the growth in data sizes versus the historical trend in the decrease in cost of storage. Cheers

On Fri, Nov 1, 2013 at 7:15 AM, Dave Cowen d...@luciddg.com wrote: Hi, all - I'm currently managing a small Cassandra cluster, several nodes with local SSD storage. It's difficult for us to forecast the growth of the Cassandra data over the next couple of years for various reasons, but it is virtually guaranteed to grow substantially.
During this time, there may be periods where it is desirable to increase the amount of storage available to each node but, assuming we are not I/O bound, keep from expanding the cluster horizontally with additional nodes that have local storage. In addition, expanding with local SSDs is costly. My colleagues and I have discussed a couple of other options that don't involve scaling horizontally or adding SSDs: 1) Move to larger, cheaper spinning-platter disks. However, when monitoring the performance of our cluster, we see sustained periods - especially during repair/compaction/cleanup - of several hours where there are > 2000 IOPS. It will be hard to reach that level of performance in each node with spinning-platter disks, and we'd prefer not to take that kind of performance hit during maintenance operations. 2) Move some nodes to a SAN solution, ensuring that there is a mix of storage, drives, LUNs and RAIDs so that there isn't a single point of failure. While we're aware that this is frowned on in the Cassandra community due to Cassandra's
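Aaron's JBOD suggestion from this thread translates to cassandra.yaml entries along these lines (the mount paths are illustrative):

```yaml
# One entry per independently mounted disk; Cassandra balances writes
# across them and runs one flush thread per data directory.
data_file_directories:
    - /data01/cassandra/data
    - /data02/cassandra/data
    - /data03/cassandra/data

# best_effort: blacklist the failed disk and keep serving from the rest.
# stop: shut down gossip and client transports so the node can be rebuilt.
disk_failure_policy: best_effort
```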
Re: Heap almost full
When we analyzed the heap, almost all of it was memtables.

What were the top classes? I would normally expect an OOM in pre-1.2 days to be the result of bloom filters, compaction metadata and index samples.

Is there any known issue with 1.1.5 which causes memtable_total_space_in_mb not to be respected, or not defaulting to 1/3rd of the heap size?

Nothing I can remember. We estimate the in-memory size of the memtables using the live ratio. That's been pretty good for a while now, but you may want to check the change log for changes there.

The latest test was running on high-performance 32-core, 128 GB RAM, 7 RAID-0 1TB disks (regular).

With all those cores, grab the TLAB setting from the 1.2 cassandra-env.sh file.

Cheers - Aaron Morton New Zealand @aaronmorton Co-Founder Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com

On 1/11/2013, at 2:59 pm, Arindam Barua aba...@247-inc.com wrote: Thank you for your responses. In another recent test, the heap actually got full and we got an out-of-memory error. When we analyzed the heap, almost all of it was memtables. Is there any known issue with 1.1.5 which causes memtable_total_space_in_mb not to be respected, or not defaulting to 1/3rd of the heap size? Or is it possible that the load in the test is so high that Cassandra is not able to keep flushing, even though it starts the process when memtable usage reaches 1/3rd of the heap? We recently switched to LeveledCompaction; however, when we got the earlier heap warning, we were running SizeTiered. The latest test was running on high-performance 32-core, 128 GB RAM, 7 RAID-0 1TB disks (regular). Earlier tests were run on lesser hardware with the same load, but there was no memory problem. We are running more tests to check if this is always reproducible. Answering some of the earlier questions if it helps: We have Cassandra 1.1.5 running in production.
Upgrading to the latest 1.2.x release is on the roadmap, but until then this needs to be figured out.

> How much data do you have per node?

We are running into these errors while running tests in QA starting with 0 load. These are roughly 4-hour tests which end up adding under 1 GB of data on each node of a 4-node ring, or a 2-node ring.

> What is the value of index_interval (cassandra.yaml)?

It's the default value of 128.

Thanks, Arindam

-----Original Message-----
From: Aaron Morton [mailto:aa...@thelastpickle.com]
Sent: Monday, October 28, 2013 12:09 AM
To: Cassandra User
Subject: Re: Heap almost full

> 1] [14/10/2013:19:15:08 PDT] ScheduledTasks:1: WARN GCInspector.java (line 145) Heap is 0.8287082580489245 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically

This means that the CMS GC was unable to free memory quickly; you've not run out, but you may do under heavy load. CMS uses CPU resources to do its job - how much CPU do you have available?

To check the behaviour of the CMS collector, use JConsole or another tool to watch the heap size; you should see a nice sawtooth graph. It should gradually grow, then drop quickly to below 3-ish GB. If CMS does not get the heap low enough, you will spend more time in GC.

You may also want to raise flush_largest_memtables_at to .8 to give CMS a chance to do its work. It starts at .75.

> In 1.2+ bloomfilters are off-heap, you can use vnodes...

+1 for 1.2 with off-heap bloom filters.

> - increasing the heap to 10GB.

-1. Unless you have a node under heavy memory pressure - pre-1.2 with 1+ billion rows and lots of bloom filters - increasing the heap is not the answer. It will increase the time taken for ParNew and CMS, and just kicks the problem down the road.
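Aaron's advice above maps onto two cassandra.yaml settings. A hedged sketch (the memtable value is illustrative; the threshold default matches what he describes):

```yaml
# Cap on total memtable memory. When left unset it defaults to
# one third of the JVM heap, the behaviour discussed above.
memtable_total_space_in_mb: 2048

# Emergency-flush threshold: when heap usage crosses this fraction,
# Cassandra flushes the largest memtables. The default is 0.75;
# Aaron suggests raising it to 0.8 to give CMS room to work first.
flush_largest_memtables_at: 0.8
```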
Cheers
- Aaron Morton, New Zealand, @aaronmorton
Co-Founder & Principal Consultant, Apache Cassandra Consulting
http://www.thelastpickle.com

On 26/10/2013, at 8:32 am, Alain RODRIGUEZ arodr...@gmail.com wrote:

If you are starting with Cassandra, I really advise you to start with 1.2.11. In 1.2+ bloom filters are off-heap, and you can use vnodes...

> I summed up the bloom filter usage reported by nodetool cfstats in all the CFs and it was under 50 MB.

This is quite a small value. Is there no error in your conversion from the bytes reported in cfstats?

If you are trying to understand this, could you tell us:
- How much data do you have per node?
- What is the value of index_interval (cassandra.yaml)?

If you are trying to fix this, you can try:
- changing memtable_total_space_in_mb to 1024
- increasing the heap to 10GB.

Hope this will help somehow :). Good luck.

2013/10/16 Arindam Barua aba...@247-inc.com

During performance testing being run on
Re: How to generate tokens for my two node Cassandra cluster?
For a while now the binary distribution has included a tool to calculate tokens:

aarons-MBP-2011:apache-cassandra-1.2.11 aaron$ tools/bin/token-generator
Token Generator Interactive Mode
How many datacenters will participate in this Cassandra cluster? 1
How many nodes are in datacenter #1? 3
DC #1:
  Node #1: 0
  Node #2: 56713727820156410577229101238628035242
  Node #3: 113427455640312821154458202477256070484

Cheers
- Aaron Morton, New Zealand, @aaronmorton
Co-Founder & Principal Consultant, Apache Cassandra Consulting
http://www.thelastpickle.com

On 2/11/2013, at 3:43 am, Ray Sutton ray.sut...@gmail.com wrote:

Your quotes need to be escaped:

python -c "num=2; print \"\n\".join([(\"token %d: %d\" % (i, (i*(2**127)/num))) for i in range(0, num)])"

-- Ray //o-o\\

On Fri, Nov 1, 2013 at 10:36 AM, Peter Sanford psanf...@nearbuysystems.com wrote:

I can't tell you why that one-liner isn't working, but you can try http://www.cassandraring.com for generating balanced tokens.

On Thu, Oct 31, 2013 at 11:59 PM, Techy Teck comptechge...@gmail.com wrote:

I am trying to set up a two-node Cassandra cluster on Windows machines. I have two Windows machines, and I was following this DataStax tutorial (http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows). Whenever I use the command below from the tutorial to get the token numbers -

python -c num=2; print \n.join([(token %d: %d %(i,(i*(2**127)/num))) for i in range(0,num)])

I always get this error -

C:\Users\username> python -c num=2; print \n.join([(token %d: %d %(i,(i*(2**127)/num))) for i in range(0,num)])
  File "<string>", line 1
    num=2; print \n.join([(token %d: %d %(i,(i*(2**127)/num))) for i in range(0,num)])
                 ^
SyntaxError: invalid syntax
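For reference, the arithmetic that the token-generator tool performs can be sketched as a small Python 3 script. This is a sketch for the RandomPartitioner, whose ring spans 0 .. 2**127 - 1; the function name is mine, not part of any Cassandra tool:

```python
# Evenly spaced initial tokens around the RandomPartitioner ring
# (0 .. 2**127 - 1), matching the spacing token-generator prints:
# each token is i * (ring_size // N).

def balanced_tokens(num_nodes: int) -> list[int]:
    """Return evenly spaced initial tokens for num_nodes nodes."""
    step = 2 ** 127 // num_nodes
    return [i * step for i in range(num_nodes)]

for n, token in enumerate(balanced_tokens(3), start=1):
    print(f"Node #{n}: {token}")
```

For 3 nodes this prints the same three tokens as the token-generator session above. Note that Cassandra 1.2's default partitioner is Murmur3Partitioner, whose token range is -2**63 .. 2**63 - 1, so this formula applies only to RandomPartitioner rings.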
Re: Not able to form a Cassandra cluster of two nodes in Windows?
> My First Node details are -
> initial_token: 0
> seeds: 10.0.0.4
> listen_address: 10.0.0.4  # IP of Machine A (Wireless LAN adapter Wireless Network Connection)
> rpc_address: 10.0.0.4
>
> My Second Node details are -
> initial_token: 0
> seeds: 10.0.0.4
> listen_address: 10.0.0.7  # IP of Machine B (Wireless LAN adapter Wireless Network Connection)
> rpc_address: 10.0.0.7

You cannot have two nodes with the same token. Is there an error in the logs? If you are just starting out, the simple thing is to delete all the data and restart the machines.

Hope that helps.
- Aaron Morton, New Zealand, @aaronmorton
Co-Founder & Principal Consultant, Apache Cassandra Consulting
http://www.thelastpickle.com

On 2/11/2013, at 6:02 am, Techy Teck comptechge...@gmail.com wrote:

In my case, both of my laptops are running Windows 7 64-bit. Not sure what the problem is...

On Fri, Nov 1, 2013 at 4:48 AM, Aaron Mintz aaron.m.mi...@gmail.com wrote:

One issue I ran into that produced similar symptoms: if you have internode_compression turned on without the proper Snappy library available for your architecture (I had 64-bit Linux), starting up will fail to link the nodes. It is also silent unless you set a certain class's logging level to DEBUG, but it basically presented as if each node formed its own single-machine ring.

On Fri, Nov 1, 2013 at 3:52 AM, Techy Teck comptechge...@gmail.com wrote:

I am trying to set up a two-node Cassandra cluster on my Windows machines. I have two Windows machines, and on both I have installed Cassandra 1.2.11 from DataStax. I was following this [tutorial](http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows) to set up the two-node cluster. After installing Cassandra on those two machines, I stopped the services for the Cassandra server, DataStax OpsCenter, and the DataStax OpsCenter agent on both machines.
And then I started making changes in the yaml files.

My First Node details are -
initial_token: 0
seeds: 10.0.0.4
listen_address: 10.0.0.4  # IP of Machine A (Wireless LAN adapter Wireless Network Connection)
rpc_address: 10.0.0.4

My Second Node details are -
initial_token: 0
seeds: 10.0.0.4
listen_address: 10.0.0.7  # IP of Machine B (Wireless LAN adapter Wireless Network Connection)
rpc_address: 10.0.0.7

Both of my servers start up properly after I start the services, but somehow they are not forming a cluster of two nodes. Is there anything I am missing here?

Machine-A nodetool information -

Datacenter: datacenter1
Replicas: 1
Address    Rack   Status  State   Load      Owns     Token
10.0.0.4   rack1  Up      Normal  212.1 KB  100.00%  5264744098649860606

Machine-B nodetool information -

Starting NodeTool
Datacenter: datacenter1
Replicas: 1
Address    Rack   Status  State   Load      Owns     Token
10.0.0.7   rack1  Up      Normal  68.46 KB  100.00%  407804996740764696
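For what it's worth, a sketch of corrected cassandra.yaml fragments along the lines of Aaron's reply: each node must get a distinct initial_token. The token values below are illustrative only and assume the Murmur3Partitioner default in 1.2, whose ring spans -2**63 .. 2**63 - 1:

```yaml
# Machine A (10.0.0.4) - hypothetical corrected fragment
initial_token: -9223372036854775808   # -2**63, start of the Murmur3 ring
seeds: "10.0.0.4"
listen_address: 10.0.0.4
rpc_address: 10.0.0.4

# Machine B (10.0.0.7) - hypothetical corrected fragment
initial_token: 0                      # halfway around the ring
seeds: "10.0.0.4"
listen_address: 10.0.0.7
rpc_address: 10.0.0.7
```

After changing initial_token on a fresh node, the simplest path (as Aaron notes) is to delete the data directories and restart, so the node bootstraps with the new token.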