Re: Nodetool snapshot, consistency and replication
Ok, thank you.

2012/4/2 Rob Coli <rc...@palominodb.com>:
> On Mon, Apr 2, 2012 at 9:19 AM, R. Verlangen <ro...@us2.nl> wrote:
>> - 3 node cluster
>> - RF = 3
>> - fully consistent (not measured, but let's say it is)
>>
>> Is it true that when I take a snapshot at only one of the 3 nodes, this contains all the data in the cluster (at least 1 replica)?
>
> Yes.
>
> =Rob
> --
> =Robert Coli
> AIMGTALK - rc...@palominodb.com
> YAHOO - rcoli.palominob
> SKYPE - rcoli_palominodb

--
With kind regards,
Robin Verlangen
www.robinverlangen.nl
Counter Column
I have encountered the following piece of information regarding the use of 'Counter Columns' in Cassandra: "If a write fails unexpectedly (timeout or loss of connection to the coordinator node) the client will not know if the operation has been performed. A retry can result in an over count" (quoted from http://wiki.apache.org/cassandra/Counters).

Is it still relevant?

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Counter-Column-tp7432010p7432010.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Counter Column
On Tue, Apr 3, 2012 at 9:11 AM, Avi-h <avih...@gmail.com> wrote:
> I have encountered the following piece of information regarding the use of 'Counter Columns' in Cassandra: "If a write fails unexpectedly (timeout or loss of connection to the coordinator node) the client will not know if the operation has been performed. A retry can result in an over count" (quoted from http://wiki.apache.org/cassandra/Counters). Is it still relevant?

It is (there is an open ticket to fix it, https://issues.apache.org/jira/browse/CASSANDRA-2495, but to be honest we don't really have a good solution for it so far).

--
Sylvain
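An illustrative sketch of the over-count scenario discussed in this thread (this is not Cassandra client code; the server and counter here are toy stand-ins): the increment is applied on the server side, but the acknowledgement is lost, so the client cannot distinguish "failed" from "succeeded but unacknowledged" and a naive retry applies the increment twice.

```ruby
# Toy model of the counter over-count on retry. FlakyCounterServer is a
# hypothetical stand-in for a coordinator node; the first increment is
# applied but its response is "lost" (raised as a timeout).
require 'timeout'

class FlakyCounterServer
  attr_reader :count

  def initialize
    @count = 0
    @drop_next_ack = true   # simulate exactly one lost response
  end

  def increment
    @count += 1             # the write is applied on the server...
    if @drop_next_ack
      @drop_next_ack = false
      raise Timeout::Error, "response lost"  # ...but the ack never arrives
    end
    :ok
  end
end

server = FlakyCounterServer.new
begin
  server.increment          # times out from the client's point of view
rescue Timeout::Error
  server.increment          # naive retry: the increment is applied again
end

puts server.count           # => 2, though the client intended a single +1
```

This is exactly why the wiki warns that a retry "can result in an over count": the operation is not idempotent, so replaying it on an ambiguous failure changes the result.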
Re: Error Replicate on write
It is, but it doesn't load it. I tried the default package manager version (3.2), 3.3 and 3.4, and this node always says that it was unable to load JNA. I put jna.jar inside /.../cassandra/lib/ where the other .jar files are. I have other nodes with the same config (without JNA) and they don't trigger this error.

On 04/02/2012 11:26 PM, aaron morton wrote:
> Is JNA.jar in the path ?
>
> Cheers
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 2/04/2012, at 10:11 PM, Carlos Juzarte Rolo wrote:
>> Hi,
>> I've been using cassandra for a while. After an upgrade to 1.0.7 every machine kept running perfectly, except one that constantly throws this error:
>>
>> ERROR [ReplicateOnWriteStage:39] 2012-04-02 12:02:55,131 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[ReplicateOnWriteStage:39,5,main]
>> java.lang.NoClassDefFoundError: Could not initialize class org.apache.cassandra.cache.FreeableMemory
>>     at org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:92)
>>     at org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:154)
>>     at org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:63)
>>     at org.apache.cassandra.db.ColumnFamilyStore.cacheRow(ColumnFamilyStore.java:1170)
>>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1194)
>>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1151)
>>     at org.apache.cassandra.db.Table.getRow(Table.java:375)
>>     at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:58)
>>     at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:99)
>>     at org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:544)
>>     at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1223)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> The machine does run, but nodetool doesn't connect to it, for example. Any idea what could be triggering this?
>> Thanks,
Repair in loop?
Hello,

I'm doing some tests with cassandra 1.0.8 using multiple data directories with individual disks in a three node cluster (replica=3). One of the tests was to replace a couple of disks and start a repair process. It started ok and refilled the disks, but I noticed that after the recovery process finished, it started a new one again:

INFO 09:34:42,481 [repair #69c95b50-7cee-11e1--6b5cbd036faf] BlockData_6f is fully synced (6 remaining column family to sync for this session)
INFO 09:41:55,288 [repair #69c95b50-7cee-11e1--6b5cbd036faf] BlockData_0d is fully synced (5 remaining column family to sync for this session)
INFO 09:42:50,169 [repair #69c95b50-7cee-11e1--6b5cbd036faf] BlockData_07 is fully synced (4 remaining column family to sync for this session)
INFO 09:45:02,743 [repair #69c95b50-7cee-11e1--6b5cbd036faf] BlockData_5a is fully synced (3 remaining column family to sync for this session)
INFO 09:48:03,010 [repair #69c95b50-7cee-11e1--6b5cbd036faf] BlockData_da is fully synced (2 remaining column family to sync for this session)
INFO 09:54:51,112 [repair #69c95b50-7cee-11e1--6b5cbd036faf] BlockData_e8 is fully synced (1 remaining column family to sync for this session)
INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1--6b5cbd036faf] BlockData_b6 is fully synced (255 remaining column family to sync for this session)
INFO 10:05:42,803 [repair #a66c8240-7d6a-11e1--6b5cbd036faf] BlockData_13 is fully synced (254 remaining column family to sync for this session)
INFO 10:08:43,354 [repair #a66c8240-7d6a-11e1--6b5cbd036faf] BlockData_8b is fully synced (253 remaining column family to sync for this session)
INFO 10:12:09,599 [repair #a66c8240-7d6a-11e1--6b5cbd036faf] BlockData_31 is fully synced (252 remaining column family to sync for this session)
INFO 10:15:43,426 [repair #a66c8240-7d6a-11e1--6b5cbd036faf] BlockData_0c is fully synced (251 remaining column family to sync for this session)
INFO 10:21:47,156 [repair #a66c8240-7d6a-11e1--6b5cbd036faf] BlockData_1b is fully synced (250 remaining column family to sync for this session)

Is this normal? To me it doesn't make much sense.

Regards,
Nuno
Re: Counter Column
Again, it will be relevant until CASSANDRA-2495 is fixed. Until then ("then" being undefined so far), it affects all versions that have counters (including 1.0.8).

--
Sylvain

On Tue, Apr 3, 2012 at 12:23 PM, Avi-h <avih...@gmail.com> wrote:
> This bug is for 0.8 beta 1; is it also relevant for 1.0.8?
Re: Repair in loop?
It just means that you have lots of column families and repair does one column family at a time. Each line is just saying it's done with one of them. There is nothing wrong, but it does mean the repair is *not* done yet.

--
Sylvain

On Tue, Apr 3, 2012 at 12:28 PM, Nuno Jordao <nuno-m-jor...@telecom.pt> wrote:
> Hello, I'm doing some tests with cassandra 1.0.8 using multiple data directories with individual disks in a three node cluster (replica=3). One of the tests was to replace a couple of disks and start a repair process. It started ok and refilled the disks, but I noticed that after the recovery process finished, it started a new one again:
> [repair log snipped]
> Is this normal? To me it doesn't make much sense.
> Regards, Nuno
RE: Repair in loop?
Thank you for your response. My point is that it is repeating the same column family:

INFO 19:12:24,656 [repair #69c95b50-7cee-11e1--6b5cbd036faf] BlockData_b6 is fully synced (255 remaining column family to sync for this session)
[...]
INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1--6b5cbd036faf] BlockData_b6 is fully synced (255 remaining column family to sync for this session)

What I was showing in my previous email is the point where it restarted:

INFO 09:54:51,112 [repair #69c95b50-7cee-11e1--6b5cbd036faf] BlockData_e8 is fully synced (1 remaining column family to sync for this session)
INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1--6b5cbd036faf] BlockData_b6 is fully synced (255 remaining column family to sync for this session)

Notice the "1 remaining column family to sync for this session" indication changes to "255 remaining column family to sync for this session".

Regards,
Nuno Jordão

-----Original Message-----
From: Sylvain Lebresne [mailto:sylv...@datastax.com]
Sent: Tuesday, 3 April 2012 11:36
To: user@cassandra.apache.org
Subject: Re: Repair in loop?

> It just means that you have lots of column family and repair does 1 column family at a time. Each line is just saying it's done with one of the column family. There is nothing wrong, but it does mean the repair is *not* done yet.
> -- Sylvain
> [quoted original message and repair log snipped]
Write performance compared to Postgresql
Hi,

I am looking at cassandra for a logging application. We currently log to a Postgresql database. I set up 2 cassandra servers for testing. I did a benchmark where I had 100 hashes representing log entries, read from a json file. I then looped over these to do 10,000 log inserts. I repeated the same writing to a postgresql instance on one of the cassandra servers. The script is attached. The cassandra writes appear to perform a lot worse. Is this expected?

jeff@transcoder01:~$ ruby cassandra-bm.rb
cassandra
  3.17  0.48  3.65 ( 12.032212)
jeff@transcoder01:~$ ruby cassandra-bm.rb
postgres
  2.14  0.33  2.47 (  7.002601)

Regards,
Jeff

require 'rubygems'
require 'cassandra-cql'
require 'simple_uuid'
require 'benchmark'
require 'json'
require 'active_record'

type = 'postgres'
#type = 'cassandra'
puts type

ActiveRecord::Base.establish_connection(
  #:adapter  => 'jdbcpostgresql',
  :adapter  => 'postgresql',
  :host     => 'meta01',
  :username => 'postgres',
  :database => 'test')

db = nil
if type == 'postgres'
  db = ActiveRecord::Base.connection
else
  db = CassandraCQL::Database.new('meta01:9160', {:keyspace => 'PlayLog'})
end

def cql_insert(table, key, key_value)
  cql =  "INSERT INTO #{table} (KEY, "
  cql << key_value.keys.join(', ')
  cql << ") VALUES ('#{key}', "
  cql << (key_value.values.map { |x| "'#{x}'" }).join(', ')
  cql << ")"
  cql
end

def quote_value(x, type=nil)
  if x.nil?
    'NULL'
  else
    "'#{x}'"
  end
end

def sql_insert(table, key_value)
  key_value.delete('time')
  sql =  "INSERT INTO #{table} ("
  sql << key_value.keys.join(', ')
  sql << ") VALUES ("
  sql << (key_value.values.map { |x| quote_value(x) }).join(', ')
  sql << ")"
  sql
end

# load 100 hashes of log details
rows = []
File.open('data.json') do |f|
  rows = JSON.load(f)
end

bm = Benchmark.measure do
  (1..10000).each do |i|
    row = rows[i % 100]
    if type == 'postgres'
      fred = sql_insert('playlog', row)
    else
      fred = cql_insert('playlog', SimpleUUID::UUID.new.to_guid, row)
    end
    db.execute(fred)
  end
end
puts bm
Re: Write performance compared to Postgresql
Hi Jeff,

Writing serially over one connection will be slower. If you run many threads hitting the server at once you will see throughput improve.

Jake

On Apr 3, 2012, at 7:08 AM, Jeff Williams <je...@wherethebitsroam.com> wrote:
> Hi, I am looking at cassandra for a logging application. We currently log to a Postgresql database. I set up 2 cassandra servers for testing. I did a benchmark where I had 100 hashes representing log entries, read from a json file. I then looped over these to do 10,000 log inserts. I repeated the same writing to a postgresql instance on one of the cassandra servers. The script is attached. The cassandra writes appear to perform a lot worse. Is this expected?
> [benchmark numbers and script snipped]
> Regards, Jeff
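The threading advice above can be sketched as follows. This is a hypothetical restructuring of the benchmark loop, with a `FakeSession` stand-in instead of a real Cassandra connection (in a real client each thread would hold its own connection): the 10,000 inserts are drained from a shared queue by several worker threads instead of one serial loop.

```ruby
# Sketch: parallelize the insert loop with worker threads. FakeSession is
# a hypothetical stand-in for a real client connection; it just counts
# writes under a mutex so the total can be checked.
require 'thread'

class FakeSession
  attr_reader :writes

  def initialize
    @mutex  = Mutex.new
    @writes = 0
  end

  def execute(_statement)
    @mutex.synchronize { @writes += 1 }
  end
end

session = FakeSession.new
rows    = (1..100).map { |i| { 'id' => i, 'msg' => "log entry #{i}" } }

queue = Queue.new
10_000.times { |i| queue << rows[i % 100] }

threads = 8.times.map do
  Thread.new do
    loop do
      begin
        row = queue.pop(true)   # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break
      end
      session.execute("INSERT INTO playlog (...) VALUES (...) /* #{row['id']} */")
    end
  end
end
threads.each(&:join)

puts session.writes   # => 10000
```

Each queued row is popped exactly once, so the total write count is unchanged; only the concurrency differs from the serial version.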
Re: Repair in loop?
On Tue, Apr 3, 2012 at 12:52 PM, Nuno Jordao <nuno-m-jor...@telecom.pt> wrote:
> Thank you for your response. My point is that it is repeating the same column family:
> [repair log snipped]
> What I was showing in my previous email is the point where it restarted:

Ok, then it's likely because those correspond to different ranges of the ring. Unless you've started the repair with "nodetool repair -pr", the repair will try to repair every range of the node, and each range will be a different repair session. I'll admit though that printing which range is being repaired would have avoided that confusion.

--
Sylvain

> [rest of quoted message snipped]
RE: Repair in loop?
Ok, thank you! :) One last question then: is "nodetool repair -pr" enough to recover a failed node?

Nuno

-----Original Message-----
From: Sylvain Lebresne [mailto:sylv...@datastax.com]
Sent: Tuesday, 3 April 2012 12:38
To: user@cassandra.apache.org
Subject: Re: Repair in loop?

> Ok, then it's likely because those correspond to different ranges of the ring. Unless you've started the repair with "nodetool repair -pr", the repair will try to repair every range of the node, and each range will be a different repair session. I'll admit though that printing which range is being repaired would have avoided that confusion.
> -- Sylvain
> [rest of quoted thread and repair log snipped]
Re: Repair in loop?
On Tue, Apr 3, 2012 at 1:55 PM, Nuno Jordao <nuno-m-jor...@telecom.pt> wrote:
> Ok, thank you! :) One last question then: is "nodetool repair -pr" enough to recover a failed node?

It's not. It's more for doing a repair of the full cluster (to ensure all nodes are in sync), in which case you'd want to run "nodetool repair -pr" on every node. It will however only repair one range on each node, so for rebuilding a failed node you'll want to stick to "nodetool repair" on the node to recover. But then it's expected to get RF repair sessions on said node.

--
Sylvain

> [rest of quoted thread snipped]
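The full-repair vs "-pr" distinction above can be shown with a toy ring model (this is not nodetool, just an illustration with made-up names): with RF=3 on a 3-node ring, each node replicates 3 ranges, so a full repair on one node runs one session per replicated range (the RF sessions mentioned), while "-pr" repairs only the node's primary range and therefore needs to run on every node to cover the whole ring.

```ruby
# Toy model of repair coverage on a 3-node ring with RF=3. Ranges are
# named after the node that owns them as primary; all names are
# hypothetical.
NODES = [:n1, :n2, :n3]
RF    = 3

# In this toy model, a node's primary range carries the node's name.
def primary_range(node)
  node
end

# Ranges a node replicates: its own primary range plus the primary
# ranges of the RF-1 preceding nodes on the ring (SimpleStrategy-style).
def replicated_ranges(node)
  i = NODES.index(node)
  (0...RF).map { |k| NODES[(i - k) % NODES.size] }
end

# A full "repair" on one node starts one session per replicated range.
full_sessions_on_n1 = replicated_ranges(:n1).size

# "-pr" on every node repairs each primary range exactly once.
pr_coverage = NODES.map { |n| primary_range(n) }

puts full_sessions_on_n1               # => 3 (RF sessions, as in the logs)
puts (pr_coverage.sort == NODES.sort)  # => true: whole ring covered once
```

This matches the thread: the two repair session IDs in the logs are two of the RF sessions of a full repair, and "-pr" on a single node would not be enough to rebuild it.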
Re: Counter Column
Sylvain explained a lot about counters at Cassandra SF 2011: http://blip.tv/datastax/counters-in-cassandra-5497678 (video), http://www.datastax.com/wp-content/uploads/2011/07/cassandra_sf_counters.pdf (slides). I think it is always important to know how things work.

Alain

2012/4/3 Sylvain Lebresne <sylv...@datastax.com>:
> Again, it will be relevant until CASSANDRA-2495 is fixed. Until then ("then" being undefined so far), it affects all versions that have counters (including 1.0.8).
> -- Sylvain
Re: Write performance compared to Postgresql
Ok, so you think the write speed is limited by the client and protocol rather than the cassandra backend? That sounds reasonable, and fits with our use case, as we will have several servers writing. However, it is a bit harder to test!

Jeff

On Apr 3, 2012, at 1:27 PM, Jake Luciani wrote:
> Hi Jeff, Writing serially over one connection will be slower. If you run many threads hitting the server at once you will see throughput improve. Jake
> [quoted original question snipped]
Re: Write performance compared to Postgresql
Note that having tons of TCP connections is not good. We are using an async client to issue multiple calls over a single connection at the same time. You can do the same.

Best regards,
Vitalii Tymchyshyn

03.04.12 16:18, Jeff Williams wrote:
> Ok, so you think the write speed is limited by the client and protocol, rather than the cassandra backend? This sounds reasonable, and fits with our use case, as we will have several servers writing. However, a bit harder to test!
> Jeff
> [rest of quoted thread snipped]
Re: Compression on client side vs server side
We are using client-side compression for the following reasons. Can you confirm they are valid? 1) Server-side compression uses RF times more CPU (3 times more with a replication factor of 3). 2) The network carries more data by the compression factor (you are sending uncompressed data over the wire). 3) Any server utility operation, like repair or move (not sure about the latter), will decompress/compress. So client-side compression looks much cheaper and can be very efficient for long columns. Best regards, Vitalii Tymchyshyn 2012/4/2 Jeremiah Jordan jeremiah.jor...@morningstar.com The server-side compression can compress across columns/rows so it will most likely be more efficient. Whether you are CPU bound or IO bound depends on your application and node setup. Unless your working set fits in memory you will be IO bound, and in that case server-side compression helps because there is less to read from disk. In many cases it is actually faster to read a compressed file from disk and decompress it than to read an uncompressed file from disk. See Ed's post: Cassandra compression is like more servers for free! http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/cassandra_compression_is_like_getting -- *From:* benjamin.j.mcc...@gmail.com [benjamin.j.mcc...@gmail.com] on behalf of Ben McCann [b...@benmccann.com] *Sent:* Monday, April 02, 2012 10:42 AM *To:* user@cassandra.apache.org *Subject:* Compression on client side vs server side Hi, I was curious if I compress my data on the client side with Snappy whether there's any difference between doing that and doing it on the server side? The wiki said that compression works best where each row has the same columns. Does this mean the compression will be more efficient on the server side since it can look at multiple rows at once instead of only the row being inserted? The reason I was thinking about possibly doing it client side was that it would save CPU on the datastore machine. However, does this matter?
Is CPU typically the bottleneck on a machine or is it some other resource? (Of course this will vary for each person, but I'm wondering if there's a rule of thumb. I'm making a web app which hopefully will store about 5TB of data and serve tens of millions of page views per month.) Thanks, Ben -- Best regards, Vitalii Tymchyshyn
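For a feel of the client-side option, here is a minimal sketch (zlib stands in for Snappy to keep the example dependency-free; the trade-off under discussion is the same either way):

```python
import zlib

# zlib stands in for Snappy here (python-snappy is a third-party package);
# either way the client pays the CPU cost once, and compressed bytes
# cross the wire and land on disk.

def compress_value(value: bytes) -> bytes:
    return zlib.compress(value)

def decompress_value(blob: bytes) -> bytes:
    return zlib.decompress(blob)

row = b'{"ts": 1333468800, "level": "INFO", "msg": "request served"}' * 100
blob = compress_value(row)
assert decompress_value(blob) == row
assert len(blob) < len(row)  # repetitive log-style data compresses well
```

The flip side, as Jeremiah notes, is that the server then only sees opaque blobs and cannot compress across rows.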
Re: Write performance compared to Postgresql
Vitalii, Yep, that sounds like a good idea. Do you have any more information about how you're doing that? Which client? Because even with 3 concurrent client nodes, my single postgresql server is still outperforming my 2 node cassandra cluster, although the gap is narrowing. Jeff On Apr 3, 2012, at 4:08 PM, Vitalii Tymchyshyn wrote: Note that having tons of TCP connections is not good. We are using async client to issue multiple calls over single connection at same time. You can do the same. Best regards, Vitalii Tymchyshyn.
RE: Write performance compared to Postgresql
So Cassandra may or may not be faster than your current system when you have a couple of connections. Where it is faster, and scales, is when you get hundreds of clients across many nodes. See: http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html With 60 clients running 200 threads each they were able to get 10K writes per second per server, and as they added servers from 48 to 288 they still got 10K writes per second per server, so the aggregate writes per second went from 48*10K to 288*10K. -Jeremiah From: Jeff Williams [je...@wherethebitsroam.com] Sent: Tuesday, April 03, 2012 10:09 AM To: user@cassandra.apache.org Subject: Re: Write performance compared to Postgresql Vitalii, Yep, that sounds like a good idea. Do you have any more information about how you're doing that? Which client? Because even with 3 concurrent client nodes, my single postgresql server is still outperforming my 2 node cassandra cluster, although the gap is narrowing. Jeff
2 questions DataStax Enterprise
Hi guys, I'm trying out DSE and looking for the best way to arrange the cluster. I have 9 nodes: 3 behind a gateway taking in writes from my collectors, and 6 outside the gateway that are supposed to take replicas from the other 3 and serve reads and analytics jobs. 1. Is it ok to run the 3 nodes as normal Cassandra nodes and run the other 6 nodes as analytics? Can I serve both real-time reads and M/R jobs from the 6 nodes? How will these affect each other performance-wise? I know that the way the system is supposed to be used is to separate analytics from real-time queries. I've already explored a possible 3DC setup with Tyler in another message and it indeed works, but I'm afraid it is too complex and would require me to send 2 replicas across the firewall, which it can't handle very well at peak times, affecting other applications. 2. I started the cluster in the setup described in 1 (3 normal, 6 analytics) and as soon as the Analytics nodes start up they start outputting this message: INFO [TASK-TRACKER-INIT] 2012-04-03 17:54:59,575 Client.java (line 629) Retrying connect to server: IP_OF_NORMAL_CASSANDRA_SEED_NODE:8012. Already tried 10 time(s). So it seems my analytics nodes are trying to contact the normal Cassandra seed node on port 8012, which I read is a Hadoop Job Tracker client port. It doesn't seem like this is the normal behavior. Why is it getting confused? In the .yaml of each node I'm using endpoint_snitch: com.datastax.bdp.snitch.DseSimpleSnitch and putting in the Analytics seed node before the normal cassandra seed node in the seeds. Cheers, Alex
RE: Write performance compared to Postgresql
Where is your client running? -Original Message- From: Jeff Williams [mailto:je...@wherethebitsroam.com] Sent: Tuesday, April 03, 2012 11:09 AM To: user@cassandra.apache.org Subject: Re: Write performance compared to Postgresql Vitalii, Yep, that sounds like a good idea. Do you have any more information about how you're doing that? Which client? Because even with 3 concurrent client nodes, my single postgresql server is still outperforming my 2 node cassandra cluster, although the gap is narrowing. Jeff
Re: 2 questions DataStax Enterprise
Hi, reply inline. On Tue, Apr 3, 2012 at 12:18 PM, Alexandru Sicoe adsi...@gmail.com wrote: Hi guys, I'm trying out DSE and looking for the best way to arrange the cluster. I have 9 nodes: 3 behind a gateway taking in writes from my collectors and 6 outside the gateway that are supposed to take replicas from the other 3 and serve reads and analytics jobs. 1. Is it ok to run the 3 nodes as normal Cassandra nodes and run the other 6 nodes as analytics? Can I serve both real time reads and M/R jobs from the 6 nodes? How will these affect each other performancewise? If you plan to use CFS heavily then it will affect the performance of the other nodes. If you raise the RF of your column families, it should be fine to run map/reduce at CL=ONE. I know that the way the system is supposed to be used is to separate analytics from real time queries. I've already explored a possible 3DC setup with Tyler in another message and it indeed works but I'm afraid it is too complex and would require me to send 2 replicas across the firewall which it can't handle very well at peak times, affecting other applications. 2. I started the cluster in the setup described in 1 (3 normal, 6 analytics) and as soon as the Analytics nodes start up they start outputting this message: INFO [TASK-TRACKER-INIT] 2012-04-03 17:54:59,575 Client.java (line 629) Retrying connect to server: IP_OF_NORMAL_CASSANDRA_SEED_NODE:8012. Already tried 10 time(s). So it seems my analytics nodes are trying to contact the normal Cassandra seed node on port 8012 which I read is a Hadoop Job Tracker client port. It doesn't seem like this is the normal behavior. Why is it getting confused? In the .yaml of each node I'm using endpoint_snitch: com.datastax.bdp.snitch.DseSimpleSnitch and putting in the Analytics seed node before the normal cassandra seed node in the seeds. You can run dsetool movejt to move the jobtracker to one of the known Hadoop nodes. Cheers, Alex -- http://twitter.com/tjake
Re: composite query performance depends on component ordering
Hi Sylvain and Aaron, Thanks for the comment Sylvain, what you say makes sense: I have microsecond-precision timestamps, and looking at some row printouts I see everything is happening at a different timestamp, which means it won't compare the second (100-byte) component. As for the methodology, it's not so thorough. I used Cassandra 0.8.5. I had acquired a large data set, about 300 hrs worth of data, in Schema 1 (details below), which I found was easily hitting thousands of rows for some queries, giving me very poor performance. I converted this data to Schema 2 (details below), grouping the data together in the same row and increasing the time bucket for the row (with two versions of the column names: Timestamp:ID and ID:Timestamp). So I obtained a CF with 66 rows, 11 rows for 3 different types of data sources which are dominant in the rates of info they give me (each row is a 24 hr time bucket). These are the results I got using the CompositeQueryIterator (with a modified max of 100,000 cols returned per slice) taken from the composite query tutorial at http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1 (code is at https://github.com/zznate/cassandra-tutorial). So basically I used null for start and end in order to read entire rows at a time. I timed my code. The actual values are doubles for all 3 types. The size is the file size after dumping the results to a text file. Ok, in my previous email I just looked at the rows with the max size, which gave me a 20% difference. In earnest it's less.
Type 1:
Cols returned  File size  ID:Timestamp (s)  Timestamp:ID (s)  Diff %
387174         25M        12.59             8.60              31.68
1005113        66M        31.83             21.84             31.38
579633         38M        18.07             12.46             31.03
1217634        81M        33.77             24.65             26.99
376303         24M        12.32             10.36             15.94
2493007        169M       68.68             59.93             12.74
6298275        428M       183.28            147.57            19.48
2777962        189M       83.16             73.30             11.86
6138047        416M       170.88            155.83            8.81
3193450        216M       93.26             82.84             11.18
2302928        155M       69.91             61.62             11.85
Avg = 19.3 %
Type 2:
Cols returned  File size  ID:Timestamp (s)  Timestamp:ID (s)  Diff %
350468         40M        12.92             13.12             -1.59
1303797        148M       43.33             38.98             10.04
697763         79M        26.78             22.05             17.66
825414         94M        33.50             26.69             20.31
55075          6.2M       2.97              2.13              28.15
1873775        213M       72.37             51.12             29.37
3982433        453M       147.04            110.71            24.71
1546491        176M       54.86             42.13             23.21
4117491        468M       143.10            114.62            19.90
1747506        199M       63.23             63.05             0.28
2720160        308M       96.06             82.47             14.14
Avg = 16.9 %
Type 3:
Cols returned  File size  ID:Timestamp (s)  Timestamp:ID (s)  Diff %
192667         7.2M       5.88              6.50              -10.49
210593         7.9M       6.33              5.57              12.06
144677         5.4M       3.78              3.74              1.22
207706         7.7M       6.33              5.74              9.28
235937         8.7M       6.34              6.11              3.64
159985         6.0M       4.23              3.93              7.07
134859         5.5M       3.91              3.38              13.46
70545          2.9M       2.96              2.08              29.84
98487          3.9M       4.04              2.62              35.22
205979         8.2M       7.35              5.67              22.87
166045         6.2M       5.12              3.99              22.10
Avg = 13.3 %
Just to understand why I did the tests. Data set: I have ~300,000 data sources. Each data source has several variables it can output values for; there are ~12 variables per data source. This gives ~4 million independent time series (let's call them streams) that need to go into Cassandra. The streams give me (timestamp, value) pairs at highly different rates, depending on the data source and operating conditions. This translates into very different row lengths if a unique time bucket is used across all streams. The data sources can be further grouped into types (several data sources can share the same type). There are ~100 types. Use case: The system
- will serve a web dashboard
- should allow queries at the highest granularity for short periods of time (up to 4-8 hrs) on any individual stream or grouping of streams
- should allow a method of obtaining on-demand (offline) analytics over long periods of time (up to 1 year), and then (real-time) querying of the analytics data
Cassandra schemas used so far: Schema 1: 1 row for each of the 3 million streams; each row is a 4 hr time bucket. Schema 2: 1 row for each of the 100 types; each row is a 24 hr time bucket. Now I'm planning to use Schema 2 only, with an 8 hr time bucket, to better reconcile between rows that get very long and ones that don't. Cheers, Alex On Sat, Mar 31, 2012 at 9:35 PM, aaron morton aa...@thelastpickle.com wrote: Can you post the details of the queries you are running, including the methodology of the tests? (Here is the methodology I used to time queries previously: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/) Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 31/03/2012, at 1:29 AM, Alexandru Sicoe wrote: Hi guys, I am consistently seeing a 20% improvement in query retrieval times if I use the composite comparator Timestamp:ID instead of ID:Timestamp, where Timestamp is a Long and ID is a ~100-character string. I am retrieving all columns
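The bucketed row-key scheme described above can be sketched as follows (the key layout is illustrative, not the poster's exact schema):

```python
BUCKET_SECONDS = 8 * 3600  # the 8 hr bucket the poster settles on

def row_key(stream_id: str, ts: int) -> str:
    # All points for one stream within the same 8 hr window share a row;
    # the column name would then be the (Timestamp, ID) composite.
    return f"{stream_id}:{ts // BUCKET_SECONDS}"

def keys_for_range(stream_id: str, start: int, end: int):
    # Row keys a reader must touch to cover the half-open range [start, end).
    first = start // BUCKET_SECONDS
    last = (end - 1) // BUCKET_SECONDS
    return [f"{stream_id}:{b}" for b in range(first, last + 1)]

assert row_key("sensor42", 0) == "sensor42:0"
assert keys_for_range("sensor42", 0, 9 * 3600) == ["sensor42:0", "sensor42:1"]
```

A shorter bucket keeps hot rows from growing unbounded; a longer one means fewer rows to slice per query, which is exactly the trade-off being reconciled here.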
Re: cassandra 1.08 on java7 and win7
puneet loya puneetloya at gmail.com writes: create keyspace DEMO with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options=[{datacenter1:1}]; try it and check if it executes Hi Puneet, I have the same issue. Running the command you mentioned works for me. What is the issue here? Gopala
RE: Counter Column
Right, it affects every version of Cassandra from 0.8 beta 1 until the Fix Version, which right now is None, so it isn't fixed yet... From: Avi-h [avih...@gmail.com] Sent: Tuesday, April 03, 2012 5:23 AM To: cassandra-u...@incubator.apache.org Subject: Re: Counter Column this bug is for 0.8 beta 1, is it also relevant for 1.0.8? -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Counter-Column-tp7432010p7432450.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Write performance compared to Postgresql
Hello. We are using a Java async Thrift client. As for Ruby, it seems you need to use something like http://www.mikeperham.com/2010/02/09/cassandra-and-eventmachine/ (not sure, as I know nothing about Ruby). Best regards, Vitalii Tymchyshyn 2012/4/3 Jeff Williams je...@wherethebitsroam.com Vitalii, Yep, that sounds like a good idea. Do you have any more information about how you're doing that? Which client? Because even with 3 concurrent client nodes, my single postgresql server is still outperforming my 2 node cassandra cluster, although the gap is narrowing. Jeff
size tiered compaction - improvement
There is a problem with the size-tiered compaction design: it compacts together tables of similar size. Sometimes you can end up with sstables sitting on disk forever (note the Feb 23 files below) because no other similarly sized tables were created, and probably never will be. A flushed sstable is about 11-16 MB, the next level is about 90 MB, then 5x 90 MB gets compacted to a 400 MB sstable, and 5x 400 MB to ~2 GB. The problem is that the 400 MB sstable is too small to be compacted against these 3x ~720 MB ones.
-rw-r--r-- 1 root wheel 165M Feb 23 17:03 resultcache-hc-13086-Data.db
-rw-r--r-- 1 root wheel 772M Feb 23 17:04 resultcache-hc-13087-Data.db
-rw-r--r-- 1 root wheel 156M Feb 23 17:06 resultcache-hc-13091-Data.db
-rw-r--r-- 1 root wheel 716M Feb 23 17:18 resultcache-hc-13096-Data.db
-rw-r--r-- 1 root wheel 734M Feb 23 17:29 resultcache-hc-13101-Data.db
-rw-r--r-- 1 root wheel 5.0G Mar 14 09:38 resultcache-hc-13923-Data.db
-rw-r--r-- 1 root wheel 1.9G Mar 16 22:41 resultcache-hc-14084-Data.db
-rw-r--r-- 1 root wheel 1.9G Mar 21 15:11 resultcache-hc-14460-Data.db
-rw-r--r-- 1 root wheel 1.9G Mar 27 05:22 resultcache-hc-14694-Data.db
-rw-r--r-- 1 root wheel 2.0G Mar 31 04:57 resultcache-hc-14851-Data.db
-rw-r--r-- 1 root wheel 112M Mar 31 06:30 resultcache-hc-14922-Data.db
-rw-r--r-- 1 root wheel 577M Apr 1 19:25 resultcache-hc-14943-Data.db
The compaction strategy needs to compact sstables by timestamp too: older tables should have an increased chance of getting compacted. For example, a table from today would be compacted with other tables in the range (0.5-1.5) of its size, and this range would widen with sstable age; a 1 month old table would have a range of, for example, (0.2-1.8).
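The proposal can be sketched as a toy bucketing function (the numbers are illustrative, chosen to match the 0.5-1.5 and 0.2-1.8 examples above; this is not Cassandra's actual compaction code):

```python
def size_range(base_size, age_days, widen_per_month=0.3):
    # A fresh sstable only pairs with tables within 0.5x-1.5x of its size;
    # each month of age widens that window (so a month-old table gets
    # roughly the 0.2x-1.8x window from the example). The cap keeps the
    # lower bound positive.
    widen = min(widen_per_month * age_days / 30.0, 0.45)
    return (0.5 - widen) * base_size, (1.5 + widen) * base_size

def candidates(sstables, target):
    # sstables: list of (size_mb, age_days); target likewise.
    lo, hi = size_range(target[0], target[1])
    return [s for s in sstables if lo <= s[0] <= hi and s != target]

tables = [(165, 40), (716, 40), (734, 40), (577, 2)]
fresh = (400, 0)   # today's 400 MB table: window 200-600 MB
old = (400, 30)    # a month old: window ~80-720 MB
assert candidates(tables, fresh) == [(577, 2)]
assert candidates(tables, old) == [(165, 40), (716, 40), (577, 2)]
```

The point of the age term is visible in the asserts: the stale 165 MB and 716 MB tables only become candidates once the 400 MB table's window has widened.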
Re: size tiered compaction - improvement
If you know for sure that you will free a lot of space by compacting some old table, then you can call a user-defined compaction for that table (you can do this from cron). There is also a ticket in JIRA with a discussion of per-sstable expired-column and tombstone counters.
Re: key cache size calculation
It depends on the workload. Increase the cache size until you see the hit rate decrease, or see it create memory pressure. Watch the logs for messages that the caches have been decreased. Take a look at the Recent Read Latency for the CF. This is how long it takes to actually read data on that node. You can then work out the throughput taking into account the concurrent_readers setting in the yaml. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 3/04/2012, at 4:14 PM, Shoaib Mir wrote: On Tue, Apr 3, 2012 at 11:49 AM, aaron morton aa...@thelastpickle.com wrote: Take a look at the key cache hit rate in nodetool cfstats. One approach is to increase the cache size until you do not see a matching increase in the hit rate. Thanks Aaron, what do you think will be the ideal cache hit ratio where we want this particular DB server to do around 5-6K responses per second? Right now it is doing just 2-3K per second, and the cache hit ratio I see in cfstats is around 85-90%. Do you think having a higher cache hit ratio, around the 95% mark, will help with getting a high throughput as well? cheers, Shoaib
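Aaron's advice can be turned into a rough back-of-envelope model of how the hit rate bounds throughput (all latencies here are illustrative placeholders, not measurements from any real cluster):

```python
def avg_read_latency_ms(hit_rate, cached_ms, uncached_ms):
    # Blended per-read latency as the cache hit rate improves.
    return hit_rate * cached_ms + (1.0 - hit_rate) * uncached_ms

def est_reads_per_sec(hit_rate, cached_ms, uncached_ms, concurrent_reads=32):
    # Rough throughput ceiling: concurrent reader slots divided by the
    # blended per-read latency. This ignores queueing and CPU, so treat
    # it as an upper bound, not a prediction.
    blended = avg_read_latency_ms(hit_rate, cached_ms, uncached_ms)
    return concurrent_reads / (blended / 1000.0)

# Illustrative numbers only: 0.5 ms for a cache-assisted read,
# 10 ms for a read that has to hit disk.
assert est_reads_per_sec(0.95, 0.5, 10.0) > est_reads_per_sec(0.85, 0.5, 10.0)
```

With numbers like these, pushing the hit rate from 85% to 95% roughly halves the blended latency, which is why the question about the 95% mark is a sensible one; plug in your own Recent Read Latency figures to get a meaningful estimate.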
Re: Error Replicate on write
What is logged when it cannot find JNA ? What is passed to the java service ? Check with ps aux | grep cassandra Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 3/04/2012, at 7:28 PM, Carlos Juzarte Rolo wrote: It is, but it doesn't load it. I tried the default package manager version (3.2), the 3.3 and the 3.4 version and this node always say that was unable to load the JNA. I put the jna.jar inside /.../cassandra/lib/ where the other .jar files are. I have other nodes with the same config (without JNA) and they don't trigger this error. On 04/02/2012 11:26 PM, aaron morton wrote: Is JNA.jar in the path ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 2/04/2012, at 10:11 PM, Carlos Juzarte Rolo wrote: Hi, I've been using cassandra for a while, but after a upgrade to 1.0.7, every machine kept running perfectly. Well, except one that constantly throws this error: ERROR [ReplicateOnWriteStage:39] 2012-04-02 12:02:55,131 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[ReplicateOnWriteStage:39,5,main] java.lang.NoClassDefFoundError: Could not initialize class org.apache.cassandra.cache.FreeableMemory at org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:92) at org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:154) at org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:63) at org.apache.cassandra.db.ColumnFamilyStore.cacheRow(ColumnFamilyStore.java:1170) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1194) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1151) at org.apache.cassandra.db.Table.getRow(Table.java:375) at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:58) at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:99) at 
org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:544) at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1223) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) The machine does run, but nodetool doesn't connect to the machine, for example. Any idea what could be triggering this? Thanks,
Re: size tiered compaction - improvement
Twitter tried a timestamp-based compaction strategy in https://issues.apache.org/jira/browse/CASSANDRA-2735. The conclusion was that it actually resulted in a lot more compactions than the SizeTieredCompactionStrategy; the increase in IO was not acceptable for their use, so they stopped working on the patch. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Largest 'sensible' value
We use 2MB chunks for our CFS implementation of HDFS: http://www.datastax.com/dev/blog/cassandra-file-system-design On Mon, Apr 2, 2012 at 4:23 AM, Franc Carter franc.car...@sirca.org.au wrote: Hi, We are in the early stages of thinking about a project that needs to store data that will be accessed by Hadoop. One of the concerns we have is around the latency of HDFS, as our use case is not reading all the data, and hence we will need custom RecordReaders etc. I've seen a couple of comments that you shouldn't put large chunks into a value - however 'large' is not well defined for the range of people using these solutions ;-) Does anyone have a rough rule of thumb for how big a single value can be before we are outside sanity? thanks -- Franc Carter | Systems architect | Sirca Ltd franc.car...@sirca.org.au | www.sirca.org.au Tel: +61 2 9236 9118 Level 9, 80 Clarence St, Sydney NSW 2000 PO Box H58, Australia Square, Sydney NSW 1215 -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
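The chunking approach generalizes to any large value: split it so no single column holds more than a bounded number of bytes. A minimal sketch (the key layout is hypothetical, not what CFS actually stores):

```python
CHUNK_SIZE = 2 * 1024 * 1024  # the 2 MB chunk size CFS uses

def split_into_chunks(blob: bytes, chunk_size: int = CHUNK_SIZE):
    # Each chunk becomes its own column keyed by a zero-padded index,
    # so sorted key order reproduces byte order on reassembly.
    return {f"chunk:{i // chunk_size:06d}": blob[i:i + chunk_size]
            for i in range(0, len(blob), chunk_size)}

def reassemble(chunks: dict) -> bytes:
    return b"".join(chunks[k] for k in sorted(chunks))

data = bytes(5 * 1024 * 1024)  # a 5 MB value becomes 3 chunks
chunks = split_into_chunks(data)
assert len(chunks) == 3
assert reassemble(chunks) == data
```

Besides keeping individual columns small, chunking also lets a reader fetch just the byte range it needs, which matters for the custom RecordReader use case above.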
Re: column’s timestamp
That would work, with the caveat that you'd have to delete it and re-insert if you want to preserve that relationship on update. On Mon, Apr 2, 2012 at 12:18 PM, Pierre Chalamet pie...@chalamet.net wrote: Hi, What about using a ts as column name and do a get sliced instead ? --Original Message-- From: Avi-h To: cassandra-u...@incubator.apache.org ReplyTo: user@cassandra.apache.org Subject: column’s timestamp Sent: Apr 2, 2012 18:24 Is it possible to fetch a column based on the row key and the column’s timestamp only (not using the column’s name)? -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/column-s-timestamp-tp7429905p7429905.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com. - Pierre -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
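Pierre's suggestion can be modeled in miniature: make the timestamp the column name, and the fetch becomes a slice over the sorted names (this models the comparator ordering only; it is not a real client API):

```python
import bisect

class Row:
    # Toy model of a row whose column names are timestamps, kept in
    # comparator (sorted) order, as a time-ordered column family would.
    def __init__(self):
        self.names = []    # sorted column names (timestamps)
        self.values = {}

    def insert(self, ts: int, value: str):
        if ts not in self.values:
            bisect.insort(self.names, ts)
        self.values[ts] = value

    def get_slice(self, start: int, finish: int):
        # Inclusive slice over column names, like a get_slice with a
        # SliceRange from start to finish.
        lo = bisect.bisect_left(self.names, start)
        hi = bisect.bisect_right(self.names, finish)
        return [(t, self.values[t]) for t in self.names[lo:hi]]

row = Row()
for ts, v in [(100, "a"), (200, "b"), (300, "c")]:
    row.insert(ts, v)
assert row.get_slice(150, 300) == [(200, "b"), (300, "c")]
```

Jonathan's caveat maps onto this model directly: changing the time associated with a value means deleting the old column name and inserting a new one.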
Re: really bad select performance
Secondary indexes can generate a lot of random i/o. iostat -x can confirm if that's your problem. On Thu, Mar 29, 2012 at 5:52 PM, Chris Hart ch...@remilon.com wrote: Hi, I have the following cluster:

136112946768375385385349842972707284580
ip address  MountainView  RAC1  Up  Normal  1.86 GB  20.00%  0
ip address  MountainView  RAC1  Up  Normal  2.17 GB  33.33%  56713727820156410577229101238628035242
ip address  MountainView  RAC1  Up  Normal  2.41 GB  33.33%  113427455640312821154458202477256070485
ip address  Rackspace     RAC1  Up  Normal  3.9 GB   13.33%  136112946768375385385349842972707284580

The following query runs quickly on all nodes except 1 MountainView node: select * from Access_Log where row_loaded = 0 limit 1; There is a secondary index on row_loaded. The query usually doesn't complete (but sometimes does) on the bad node and returns very quickly on all other nodes. I've upped the rpc timeout to a full minute (rpc_timeout_in_ms: 60000) in the yaml, but it still often doesn't complete in a minute. It seems just as likely to complete and takes about the same amount of time whether the limit is 1, 100 or 1000. Thanks for any help, Chris -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: key cache size calculation
On Wed, Apr 4, 2012 at 8:04 AM, aaron morton aa...@thelastpickle.com wrote: It depends on the workload. Increase the cache size until you see the hit rate decrease, or see it create memory pressure. Watch the logs for messages that the caches have been decreased. Take a look at the Recent Read Latency for the CF. This is how long it takes to actually read data on that node. You can then work out the throughput taking into account the concurrent_readers setting in the yaml. Thanks Aaron, I will try this. cheers, Shoaib
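Aaron's throughput estimate is simple arithmetic; a sketch, assuming read latency is the bottleneck and the default concurrent read stage size of 32 (both are assumptions, not measurements):

```python
def max_read_throughput(recent_read_latency_ms, concurrent_reads=32):
    """Back-of-envelope ceiling on reads/sec for one node: each of the
    concurrent_reads threads finishes ~1000/latency reads per second."""
    return concurrent_reads * (1000.0 / recent_read_latency_ms)

# e.g. a 2 ms Recent Read Latency with 32 concurrent readers
print(max_read_throughput(2.0))  # 16000.0 reads/sec at best
```

If the key cache hit rate rises, Recent Read Latency drops and the ceiling rises proportionally, which is why watching the hit rate while growing the cache is the useful knob.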
Re: tombstones problem with 1.0.8
Removing expired columns actually requires two compaction passes: one to turn the expired column into a tombstone; one to remove the tombstone after gc_grace_seconds. (See https://issues.apache.org/jira/browse/CASSANDRA-1537.) Perhaps CASSANDRA-2786 was causing things to (erroneously) be cleaned up early enough that this helped you out in 0.8.2? On Wed, Mar 21, 2012 at 8:38 PM, Ross Black ross.w.bl...@gmail.com wrote: Hi, We recently moved from 0.8.2 to 1.0.8 and the behaviour seems to have changed so that tombstones are now not being deleted. Our application continually adds and removes columns from Cassandra. We have set a short gc_grace time (3600) since our application would automatically delete zombies if they appear. Under 0.8.2, the tombstones remained at a relatively constant number. Under 1.0.8, the tombstones have been continually increasing so that they exceed the size of our real data (at this stage we have over 100G of tombstones). Even after running a full compact the new compacted SSTable contains a massive number of tombstones, many that are several weeks old. Have I missed some new configuration option to allow deletion of tombstones? I also noticed that one of the changes between 0.8.2 and 1.0.8 was https://issues.apache.org/jira/browse/CASSANDRA-2786 which changed code to avoid dropping tombstones when they might still be needed to shadow data in another sstable. Could this be having an impact since we continually add and remove columns even while a major compact is executing? Thanks, Ross -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
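The two-pass lifecycle Jonathan describes reduces to simple arithmetic on the timestamps; a sketch using Ross's gc_grace of 3600 (illustrative times only - in practice compaction must also actually visit the sstables holding the tombstones):

```python
def tombstone_purgeable_at(delete_ts, gc_grace_seconds):
    """A deletion tombstone may only be dropped by compaction after
    gc_grace_seconds, so repair can spread the delete to all replicas."""
    return delete_ts + gc_grace_seconds

def expiring_purgeable_at(write_ts, ttl, gc_grace_seconds):
    """An expiring (TTL) column takes two passes: it becomes a tombstone
    at write_ts + ttl, and that tombstone is droppable at the earliest
    gc_grace_seconds later."""
    return write_ts + ttl + gc_grace_seconds

print(tombstone_purgeable_at(0, 3600))        # 3600
print(expiring_purgeable_at(0, 86400, 3600))  # 90000
```

So even with a short gc_grace, tombstones only disappear when a compaction that includes their sstable runs after these times have passed, which is consistent with Ross seeing weeks-old tombstones survive.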
Re: size tiered compaction - improvement
On 3.4.2012 23:04, i...@4friends.od.ua wrote: if you know for sure that you will free a lot of space by compacting some old table, then you can call forceUserDefinedCompaction for this table (you can do this from cron). There is also a ticket in jira with discussion on per-sstable expired column and tombstone counters. you are talking about the CompactionManager.forceUserDefinedCompaction mbean? it takes 2 arguments, with no description of them. I never got this to work. NoSuchElementException returned
System keyspace leak?
I've been trying to understand the overhead of create/drop keyspace on Cassandra 1.0.8. It's not free, especially when I've managed to drive up the LiveDiskSpaceUsed for the Migrations CF in the system keyspace up to over 12 MB of disk. I've tried doing nodetool -h localhost repair system and other nodetool commands to try to compact the SSTables involved with it, but it never wants to let go of that slowly growing space. The Cassandra node in question is in a ring of size 1. Other than clobbering my data directory, how do I get my space back? Is it natural for this to grow seemingly infinitely (even though it's pretty small increments), or did I find a bug? The reason I ask is I need to run unit tests in a shared developer infrastructure with Cassandra, and we were having a little trouble with TRUNCATE on column families, but that might have been environmental (I've not looked deeply into it). Which is less expensive? Create/Drop, or Truncate? I don't expect Truncate to swell the Migration Column Family because it tracks (seemingly) schema changes. Dave
Re: Largest 'sensible' value
On Wed, Apr 4, 2012 at 8:56 AM, Jonathan Ellis jbel...@gmail.com wrote: We use 2MB chunks for our CFS implementation of HDFS: http://www.datastax.com/dev/blog/cassandra-file-system-design thanks On Mon, Apr 2, 2012 at 4:23 AM, Franc Carter franc.car...@sirca.org.au wrote: Hi, We are in the early stages of thinking about a project that needs to store data that will be accessed by Hadoop. One of the concerns we have is around the latency of HDFS, as our use case is not for reading all the data and hence we will need custom RecordReaders etc. I've seen a couple of comments that you shouldn't put large chunks into a value - however 'large' is not well defined for the range of people using these solutions ;-) Does anyone have a rough rule of thumb for how big a single value can be before we are outside sanity? thanks -- Franc Carter | Systems architect | Sirca Ltd franc.car...@sirca.org.au | www.sirca.org.au Tel: +61 2 9236 9118 Level 9, 80 Clarence St, Sydney NSW 2000 PO Box H58, Australia Square, Sydney NSW 1215 -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com -- *Franc Carter* | Systems architect | Sirca Ltd franc.car...@sirca.org.au | www.sirca.org.au Tel: +61 2 9236 9118 Level 9, 80 Clarence St, Sydney NSW 2000 PO Box H58, Australia Square, Sydney NSW 1215
Re: System keyspace leak?
Well I just found this: http://wiki.apache.org/cassandra/LiveSchemaUpdates which explains a ton... It looks like this particular Column Family will grow infinitely (it's just one row with a column per migration), so if I'm pounding on my Cassandra node with CREATE/DROP activity, I'm going to make VERY wide row. That tells me enough to say don't do that! :-) Dave On Tue, Apr 3, 2012 at 6:00 PM, David Leimbach leim...@gmail.com wrote: I've been trying to understand the overhead of create/drop keyspace on Cassandra 1.0.8. It's not free, especially when I've managed to drive up the LiveDiskSpaceUsed for the Migrations CF in the system keyspace up to over 12 MB of disk. I've tried doing nodetool -h localhost repair system and other nodetool commands to try to compact the SSTables involved with it, but it never wants to let go of that slowly growing space. The Cassandra node in question is in a ring of size 1. Other than clobbering my data directory, how do I get my space back? Is it natural for this to grow seemingly infinitely (even though it's pretty small increments), or did I find a bug? The reason I ask is I need to run unit tests in a shared developer infrastructure with Cassandra, and we were having a little trouble with TRUNCATE on column families, but that might have been environmental (I've not looked deeply into it). Which is less expensive? Create/Drop, or Truncate? I don't expect Truncate to swell the Migration Column Family because it tracks (seemingly) schema changes. Dave
Re: size tiered compaction - improvement
The first is the keyspace name, the second is the sstable name (like transaction-hc-1024-Data.db). -Original Message- From: Radim Kolar h...@filez.com To: user@cassandra.apache.org Sent: Wed, 04 Apr 2012 3:14 Subject: Re: size tiered compaction - improvement On 3.4.2012 23:04, i...@4friends.od.ua wrote: if you know for sure that you will free a lot of space by compacting some old table, then you can call forceUserDefinedCompaction for this table (you can do this from cron). There is also a ticket in jira with discussion on per-sstable expired column and tombstone counters. you are talking about the CompactionManager.forceUserDefinedCompaction mbean? it takes 2 arguments, with no description of them. I never got this to work. NoSuchElementException returned
Re: size tiered compaction - improvement
Here is a small python script I run once per day. You have to adjust the size and/or age limits in the 'if' condition. Also I use the mx4j interface for jmx calls.

#!/usr/bin/env python
import sys,os,glob,time,urllib2

CASSANDRA_DATA='/spool1/cassandra/data'
DONTTOUCH=('system',)
now = time.time()

def main():
    kss=[ks for ks in os.listdir(CASSANDRA_DATA) if ks not in DONTTOUCH]
    for ks in kss:
        sstables=[sst for sst in glob.glob(CASSANDRA_DATA+'/'+ks+'/'+'*-Data.db') if sst.find('-tmp-')==-1]
        for table in sstables:
            st = os.stat(table)
            age=(now-st.st_mtime)/24/3600
            size=st.st_size/1024/1024/1024
            if (age >= 5 and size >= 5) or age >= 10:
                table_name = table.split('/')[-1]
                print 'compacting', ks, table_name
                url='http://localhost:8081/invoke?operation=forceUserDefinedCompaction&objectname=org.apache.cassandra.db%%3Atype%%3DCompactionManager&value0=%s&type0=java.lang.String&value1=%s&type1=java.lang.String'%(ks, table_name)
                r=urllib2.urlopen(url)
                time.sleep(1)

if __name__=='__main__':
    main()

On 04/04/2012 07:47 AM, i...@4friends.od.ua wrote: The first is the keyspace name, the second is the sstable name (like transaction-hc-1024-Data.db). -Original Message- From: Radim Kolar h...@filez.com To: user@cassandra.apache.org Sent: Wed, 04 Apr 2012 3:14 Subject: Re: size tiered compaction - improvement On 3.4.2012 23:04, i...@4friends.od.ua wrote: if you know for sure that you will free a lot of space by compacting some old table, then you can call forceUserDefinedCompaction for this table (you can do this from cron). There is also a ticket in jira with discussion on per-sstable expired column and tombstone counters. you are talking about the CompactionManager.forceUserDefinedCompaction mbean? it takes 2 arguments, with no description of them. I never got this to work. NoSuchElementException returned
Re: cassandra 1.08 on java7 and win7
Thank you Gopala :) There is no issue with it. I might have been typing something wrong - a minor mistake :) On Tue, Apr 3, 2012 at 11:51 PM, Gopala f2001...@gmail.com wrote: puneet loya puneetloya at gmail.com writes: create keyspace DEMO with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options=[{datacenter1:1}]; try it and check if it executes Hi Puneet, I have the same issue. Running the command you mentioned below works for me. What is the issue here? Gopala
Re: data size difference between supercolumn and regular column
Do you have a good reference for maintenance scripts for a Cassandra ring? Thanks, *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Tue, Apr 3, 2012 at 4:37 AM, aaron morton aa...@thelastpickle.com wrote: If you have a workload with overwrites you will end up with some data needing compaction. Running a nightly manual compaction would remove this, but it will also soak up some IO so it may not be the best solution. I do not know if Leveled compaction would result in a smaller disk load for the same workload. I agree with other people, turn on compression. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 3/04/2012, at 9:19 AM, Yiming Sun wrote: Yup Jeremiah, I learned a hard lesson on how cassandra behaves when it runs out of disk space :-S. I didn't try the compression, but when it ran out of disk space, or came near running out, compaction would fail because it needs space to create some tmp data files. I shall get a tattoo that says keep it around 50% -- this is a valuable tip. -- Y. On Sun, Apr 1, 2012 at 11:25 PM, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote: Is that 80% with compression? If not, the first thing to do is turn on compression. Cassandra doesn't behave well when it runs out of disk space. You really want to try and stay around 50%; 60-70% works, but only if it is spread across multiple column families, and even then you can run into issues when doing repairs. -Jeremiah On Apr 1, 2012, at 9:44 PM, Yiming Sun wrote: Thanks Aaron. Well, I guess it is possible the data files from supercolumns could've been reduced in size after compaction. This brings up yet another question. Say I am on a shoestring budget and can only put together a cluster with very limited storage space. The first iteration of pushing data into cassandra would drive the disk usage up into the 80% range.
As time goes by, there will be updates to the data, and many columns will be overwritten. If I just push the updates in, the disks will run out of space on all of the cluster nodes. What would be the best way to handle such a situation if I cannot buy larger disks? Do I need to delete the rows/columns that are going to be updated, do a compaction, and then insert the updates? Or is there a better way? Thanks -- Y. On Sat, Mar 31, 2012 at 3:28 AM, aaron morton aa...@thelastpickle.com wrote: does cassandra 1.0 perform some default compression? No. The on disk size depends to some degree on the workload. If there are a lot of overwrites or deletes you may have rows/columns that need to be compacted. You may have some big old SSTables that have not been compacted for a while. There is some overhead involved in the super columns: the super col name, the length of the name and the number of columns. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 29/03/2012, at 9:47 AM, Yiming Sun wrote: Actually, after I read an article on cassandra 1.0 compression just now ( http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression ), I am more puzzled. In our schema, we didn't specify any compression options -- does cassandra 1.0 perform some default compression? or is the data reduction purely because of the schema change? Thanks. -- Y. On Wed, Mar 28, 2012 at 4:40 PM, Yiming Sun yiming@gmail.com wrote: Hi, We are trying to estimate the amount of storage we need for a production cassandra cluster. While I was doing the calculation, I noticed a very dramatic difference in terms of storage space used by cassandra data files. Our previous setup consists of a single-node cassandra 0.8.x with no replication, and the data is stored using supercolumns, and the data files total about 534GB on disk.
A few weeks ago, I put together a cluster consisting of 3 nodes running cassandra 1.0 with a replication factor of 2, and the data is flattened out and stored using regular columns. And the aggregated data file size is only 488GB (it would be 244GB with no replication). This is a very dramatic reduction in terms of storage needs, and is certainly good news in terms of how much storage we need to provision. However, because of the dramatic reduction, I also would like to make sure it is absolutely correct before submitting it - and also get a sense of why there was such a difference. -- I know cassandra 1.0 does data compression, but does the schema change from supercolumn to regular column also help reduce storage usage? Thanks. -- Y.
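For reference, the flattening Yiming describes is typically done by joining the super column and subcolumn names into a single regular column name; a toy sketch (the ':' separator and the sample data are assumptions, not his schema):

```python
def flatten_supercolumns(row, sep=':'):
    """Flatten {super_name: {sub_name: value}} into one level of
    {super_name + sep + sub_name: value} regular columns, removing the
    per-supercolumn name/length/count overhead Aaron mentions."""
    flat = {}
    for super_name, subcolumns in row.items():
        for sub_name, value in subcolumns.items():
            flat[super_name + sep + sub_name] = value
    return flat

row = {'2012-03': {'views': '10', 'clicks': '2'},
       '2012-04': {'views': '7'}}
print(sorted(flatten_supercolumns(row)))
# ['2012-03:clicks', '2012-03:views', '2012-04:views']
```

The supercolumn name is repeated in every flattened column name, so the savings come from dropping the supercolumn envelope (and, in 1.0, from any compression enabled on the CF), not from storing fewer bytes of names.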
Re: counter column family
Hi! So, if I am using Hector, I need to do: cassandraHostConfigurator.setRetryDownedHosts(false)? How will this affect my application generally? Thanks *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Tue, Mar 27, 2012 at 4:25 PM, R. Verlangen ro...@us2.nl wrote: You should use a connection pool without retries to prevent a single increment of +1 from having a result of e.g. +3. 2012/3/27 Rishabh Agrawal rishabh.agra...@impetus.co.in You can even define how much increment you want. But let me just warn you: as far as my knowledge goes, it has consistency issues. *From:* puneet loya [mailto:puneetl...@gmail.com] *Sent:* Tuesday, March 27, 2012 5:59 PM *To:* user@cassandra.apache.org *Subject:* Re: counter column family thanks a ton :) :) the counter column family works synonymously with 'auto increment' in other databases, right? I mean we have a column of type integer which increments with every insert. Am I going the right way?? please reply :) On Tue, Mar 27, 2012 at 5:50 PM, R. Verlangen ro...@us2.nl wrote: *create column family MyCounterColumnFamily with default_validation_class=CounterColumnType and key_validation_class=UTF8Type and comparator=UTF8Type;* There you go! Keys must be utf8, as well as the column names. Of course you can change those validators. Cheers! 2012/3/27 puneet loya puneetl...@gmail.com Can you give an example of create column family with a counter column in it. Please reply Regards, Puneet Loya -- With kind regards, Robin Verlangen www.robinverlangen.nl -- Impetus to sponsor and exhibit at Structure Data 2012, NY; Mar 21-22. Know more about our Big Data quick-start program at the event. New Impetus webcast ‘Cloud-enabled Performance Testing vis-à-vis On-premise’ available at http://bit.ly/z6zT4L. NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law.
The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference. -- With kind regards, Robin Verlangen www.robinverlangen.nl
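The over-count Robin warns about (a +1 turning into a +3) is just the non-idempotence of counter increments; a small simulation, with hypothetical ack flags standing in for coordinator timeouts:

```python
def increment_with_retries(counter, acks):
    """Apply +1 per attempt, retrying while the ack is lost. The write
    lands on the replica even when the client sees a timeout, so each
    retry can add an extra +1 -- counter increments are not idempotent."""
    for ack_received in acks:
        counter['value'] += 1    # the replica applies the increment
        if ack_received:         # the client only stops on an ack
            break
    return counter

c = {'value': 0}
# two simulated timeouts, then success: the client believes it did one +1
increment_with_retries(c, acks=[False, False, True])
print(c['value'])  # 3 -- an over-count
```

This is why disabling retries in the pool (as with setRetryDownedHosts(false) above) trades availability for not silently multiplying increments; a regular insert retried the same way would simply overwrite itself.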