Distinct Counter Proposal for Cassandra
Hi All, Let's assume we have a use case where we need to count the number of columns for a given key. Say the key is a URL and the column name is an IP address or any other cardinality identifier. The straightforward implementation seems simple: just insert the IP addresses as columns under the key defined by the URL and use get_count to count them back. The problem is that for large rows (with too many IP addresses in them), get_count has to deserialize the whole row to calculate the count. As the user guides also note, it's not an O(1) operation and it's quite costly. However, this problem has better solutions if you don't have a strict requirement for the count to be exact. There are streaming algorithms that provide good cardinality estimates within a predefined error rate; the most popular one seems to be the (Hyper)LogLog algorithm, and an optimal one was developed recently, please check http://dl.acm.org/citation.cfm?doid=1807085.1807094 If you want to take a look at a Java implementation of LogLog, Clearspring has both LogLog and a space-optimized HyperLogLog available at https://github.com/clearspring/stream-lib I don't see a reason why this can't be implemented in Cassandra. The distributed nature of all these algorithms can easily be adapted to Cassandra's model. I think most of us would love to see some cardinality-estimating columns in Cassandra. Regards, Utku
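For illustration, the idea can be sketched in a few dozen lines. This is a toy HyperLogLog, not Cassandra's or Clearspring's implementation; the register count and bias constant are chosen for brevity:

```python
import hashlib
import math

class HyperLogLog:
    """Toy HyperLogLog: m = 2**p registers, each storing the maximum
    'rank' (leading-zero count + 1) seen among keys hashing to it."""

    def __init__(self, p=6):
        self.p = p
        self.m = 1 << p                     # 64 registers
        self.registers = [0] * self.m
        self.alpha = 0.709                  # bias-correction constant for m = 64

    def add(self, item):
        # 64-bit hash; md5 is just a convenient deterministic stand-in here
        h = int(hashlib.md5(str(item).encode()).hexdigest(), 16) & ((1 << 64) - 1)
        idx = h & (self.m - 1)              # low p bits choose the register
        w = h >> self.p                     # remaining bits determine the rank
        rank = (64 - self.p) - w.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        e = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if e <= 2.5 * self.m and zeros:     # small-range (linear counting) correction
            e = self.m * math.log(self.m / zeros)
        return e
```

Note that re-adding a duplicate never changes a register (same hash, same rank), which is exactly why the sketch counts distinct values, and merging two sketches is just a register-wise max, which is what makes the approach friendly to a distributed model like Cassandra's.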
Re: Distinct Counter Proposal for Cassandra
Hi Yuki, I think I should have used the word discussion instead of proposal in the subject. I have a rough design in mind, but I don't think it's ripe enough to formalize yet. I'll try to simplify it and open a JIRA ticket. But first I'm wondering whether there would be any excitement in the community for such a feature. Regards, Utku On Wed, Jun 13, 2012 at 7:00 PM, Yuki Morishita mor.y...@gmail.com wrote: You can open a JIRA ticket at https://issues.apache.org/jira/browse/CASSANDRA with your proposal. Just for the input: I had once implemented a HyperLogLog counter for internal use in Cassandra, but it turned out I didn't need it, so I just put it in a gist. You can find it here: https://gist.github.com/2597943 The above implementation and most of the others (including stream-lib) implement the optimized version of the algorithm, which counts up to 10^9, so it may need some work. Another alternative is the self-learning bitmap (http://ect.bell-labs.com/who/aychen/sbitmap4p.pdf) which, in my understanding, is more memory efficient when counting small values. Yuki
Re: last record rowId
As far as I can tell, this functionality doesn't exist. However, you could insert the rowId as a column within a separate row and request the latest column; I think that would work for you. Note that every insert would then need a get request first, which I suspect would be a performance issue. Regards, Utku On Wed, Jun 15, 2011 at 11:14 AM, karim abbouh karim_...@yahoo.fr wrote: In my Java application, when we insert we always need to know the last rowId in order to insert the new record at rowId+1, so we have to save this rowId in a file. Is there another way to know the last record's rowId? thanks B.R
Re: Corrupted Counter Columns
Hello, Actually I did not have the patience to investigate further; I had to drop the CF and start from scratch. Even though there were no writes to those particular columns, while reading at CL.ONE there was a 50% chance that: - The query returned the correct value (51664) - The query returned a nonsense value (18651001) (I say this is nonsense because there were no more than 52K increment requests, and all increments are +1 increments.) After starting from scratch, I'm writing with CL.ONE and reading with CL.QUORUM. Things seem to work fine. On Fri, May 27, 2011 at 1:59 PM, Sylvain Lebresne sylv...@datastax.com wrote: On Thu, May 26, 2011 at 2:21 PM, Utku Can Topçu u...@topcu.gen.tr wrote: Hello, I'm using 0.8.0-rc1, with RF=2 and 4 nodes. Strangely, counters are corrupted. Say the actual value should be 51664; the value that cassandra sometimes outputs is either 51664 or 18651001. What does sometimes mean in that context? Is it that some queries return the former and some others the latter? Does the returned value keep alternating despite no writes coming in, or does it at least stabilize to one of those values? Could you give more details on how this manifests itself. Does it depend on which node you connect to for the request, for instance, and does querying at QUORUM solve it? And I have no idea how to diagnose the problem or reproduce it. Can you help me fix this issue? Regards, Utku
Re: expiring + counter column?
How about implementing a freezing mechanism on counter columns? If there are no more increments within freeze seconds after the last increment (it would be on the order of a day or so), the column would lock itself and stop accepting increments. After this freeze period, the TTL should work fine: the column would be gone forever after freeze + ttl seconds. On Sat, May 28, 2011 at 2:57 AM, Jonathan Ellis jbel...@gmail.com wrote: No. See comments to https://issues.apache.org/jira/browse/CASSANDRA-2103 On Fri, May 27, 2011 at 7:29 PM, Yang tedd...@gmail.com wrote: is this combination feature available, or on track? thanks Yang -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
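The freeze-then-expire idea above can be sketched with explicit timestamps. This is a hypothetical model of the proposed semantics, not Cassandra code; `freeze` and `ttl` are the two windows described in the mail:

```python
class FreezingCounter:
    """Counter that stops accepting increments `freeze` seconds after
    the last successful increment, and expires `ttl` seconds after that."""

    def __init__(self, freeze, ttl):
        self.freeze = freeze
        self.ttl = ttl
        self.value = 0
        self.last_increment = None

    def increment(self, now, delta=1):
        # once the freeze window has passed, the counter is locked
        if self.last_increment is not None and now > self.last_increment + self.freeze:
            return False
        self.value += delta
        self.last_increment = now
        return True

    def is_expired(self, now):
        # the column is gone forever after freeze + ttl seconds of inactivity
        if self.last_increment is None:
            return False
        return now > self.last_increment + self.freeze + self.ttl
```

The point of the freeze window is that once the counter is locked, its TTL clock can no longer be reset by a late increment, sidestepping the TTL-vs-increment interaction discussed in CASSANDRA-2103.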
Corrupted Counter Columns
Hello, I'm using 0.8.0-rc1, with RF=2 and 4 nodes. Strangely, counters are corrupted. Say the actual value should be 51664; the value that cassandra sometimes outputs is either 51664 or 18651001. And I have no idea how to diagnose the problem or reproduce it. Can you help me fix this issue? Regards, Utku
Re: Corrupted Counter Columns
Some additional information on the settings: I'm using CL.ONE for both reading and writing, and replicate_on_write is true on the Counters CF. I think the problem occurs after a restart, when the commitlogs are read. On Thu, May 26, 2011 at 2:21 PM, Utku Can Topçu u...@topcu.gen.tr wrote: Hello, I'm using 0.8.0-rc1, with RF=2 and 4 nodes. Strangely, counters are corrupted. Say the actual value should be 51664; the value that cassandra sometimes outputs is either 51664 or 18651001. And I have no idea how to diagnose the problem or reproduce it. Can you help me fix this issue? Regards, Utku
CounterColumn increments gone after restart
Hi guys, I have a strange problem with 0.8.0-rc1. I'm not quite sure if this is the way it should be, but: - I create a ColumnFamily named Counters - do a few increments on a column - kill cassandra - start cassandra When I look at the counter column, the value is 1. See the following pastebin please: http://pastebin.com/9jYdDiRY
Re: CounterColumn increments gone after restart
see the ticket https://issues.apache.org/jira/browse/CASSANDRA-2642 please On Thu, May 12, 2011 at 3:28 PM, Utku Can Topçu u...@topcu.gen.tr wrote: Hi guys, I have a strange problem with 0.8.0-rc1. I'm not quite sure if this is the way it should be, but: - I create a ColumnFamily named Counters - do a few increments on a column - kill cassandra - start cassandra When I look at the counter column, the value is 1. See the following pastebin please: http://pastebin.com/9jYdDiRY
Do counter columns support TTL
Hi All, I'm experimenting and developing using counters. However, I've come to a use case where I need counters to expire and get deleted after a certain time of inactivity (i.e. have the counter column deleted one hour after the last increment). As far as I can tell, counter columns don't have TTL in the thrift interface; is this because of a limitation of the counter implementation? Regards, Utku
Re: Commercial support for cassandra
http://wiki.apache.org/cassandra/ThirdPartySupport On Thu, Feb 17, 2011 at 12:20 AM, Sal Fuentes fuente...@gmail.com wrote: They also offer great training sessions. Have a look at their site for more information: http://www.datastax.com/about-us On Wed, Feb 16, 2011 at 3:13 PM, Michael Widmann michael.widm...@gmail.com wrote: riptano - contact matt pfeil mike 2011/2/17 A J s5a...@gmail.com By any chance are there companies that provide support for Cassandra ? Consult on setup and configuration and annual support packages ? -- bayoda.com - Professional Online Backup Solutions for Small and Medium Sized Companies -- Salvador Fuentes Jr.
Re: Do counter columns support TTL
Can anyone confirm that this patch works with the current trunk? On Thu, Feb 17, 2011 at 4:16 PM, Sylvain Lebresne sylv...@datastax.com wrote: https://issues.apache.org/jira/browse/CASSANDRA-2103 On Thu, Feb 17, 2011 at 4:05 PM, Utku Can Topçu u...@topcu.gen.tr wrote: Hi All, I'm experimenting and developing using counters. However, I've come to a use case where I need counters to expire and get deleted after a certain time of inactivity (i.e. have the counter column deleted one hour after the last increment). As far as I can tell, counter columns don't have TTL in the thrift interface; is this because of a limitation of the counter implementation? Regards, Utku
Re: Do counter columns support TTL
And I think this patch would still be useful and legitimate if the TTL of the initial increment is taken into account. On Thu, Feb 17, 2011 at 6:11 PM, Utku Can Topçu u...@topcu.gen.tr wrote: Yes, I've read the discussion. My use case is similar to the contributor's use case; that's the reason I asked whether it works or not (with the flaw, of course). On Thu, Feb 17, 2011 at 5:41 PM, Jonathan Ellis jbel...@gmail.com wrote: If you read the discussion on that ticket, the point is that the approach is fundamentally flawed. On Thu, Feb 17, 2011 at 10:16 AM, Utku Can Topçu u...@topcu.gen.tr wrote: Can anyone confirm that this patch works with the current trunk? On Thu, Feb 17, 2011 at 4:16 PM, Sylvain Lebresne sylv...@datastax.com wrote: https://issues.apache.org/jira/browse/CASSANDRA-2103 On Thu, Feb 17, 2011 at 4:05 PM, Utku Can Topçu u...@topcu.gen.tr wrote: Hi All, I'm experimenting and developing using counters. However, I've come to a use case where I need counters to expire and get deleted after a certain time of inactivity (i.e. have the counter column deleted one hour after the last increment). As far as I can tell, counter columns don't have TTL in the thrift interface; is this because of a limitation of the counter implementation? Regards, Utku -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Implementing a LRU in Cassandra
Dear Aaron, Thank you for your suggestion, I'll be evaluating it. Since all my other use cases are implemented in Cassandra, I had the question in my mind of whether it was possible to implement the sorted set in Cassandra :) The problem here is that within a few hours I might be resolving more than 2M pages. It seems using redis would also cause a problem on deletion, whereas in cassandra I could rely on the expiration of the columns. It looks like the sorted set won't support partitioning, and thus won't be scalable at the end of the day. Regards, Utku On Thu, Feb 10, 2011 at 9:54 AM, aaron morton aa...@thelastpickle.com wrote: FWIW and depending on the size of the data, I would consider using sorted sets in redis http://redis.io/commands#sorted_set Where the member is the page url and the weight is the timestamp, use ZRANGE to get back the top 1,000 entries in the set. Would that work for you? Aaron On 9 Feb 2011, at 23:58, Utku Can Topçu wrote: Hi All, I'm sure people here have tried to solve similar questions. Say I'm tracking pages, and I want to access the least recently used 1,000 unique pages (i.e. column names). How can I achieve this? Using a row with, say, ttl=60 seconds would solve the problem of accessing the least recently used unique pages in the last minute. Thanks for any comments and help. Regards, Utku
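Aaron's sorted-set suggestion can be emulated in plain code to see the access pattern. This is an illustrative stand-in for redis ZADD/ZRANGE semantics, not the redis client API:

```python
class SortedSet:
    """Minimal stand-in for a redis sorted set: one score per member,
    range queries returned in score order."""

    def __init__(self):
        self.scores = {}

    def zadd(self, member, score):
        # re-adding an existing member just updates its score,
        # which is what makes the LRU trick work
        self.scores[member] = score

    def zrange(self, start, stop):
        ordered = sorted(self.scores, key=self.scores.get)
        return ordered[start:stop + 1]      # inclusive stop, like redis
```

With page URLs as members and last-seen timestamps as scores, `zrange(0, 999)` returns the 1,000 least recently seen pages, matching the query in the original mail.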
Re: Super Slow Multi-gets
Dear Bill, How about the size of the rows in the Messages CF? Are they too big? Could bandwidth overhead be a factor? Regards, Utku On Thu, Feb 10, 2011 at 5:00 PM, Bill Speirs bill.spe...@gmail.com wrote: I have a 7 node setup with a replication factor of 1 and a read consistency of 1. I have two column families: Messages, which stores millions of rows with a UUID for the row key, and DateIndex, which stores thousands of rows with a String as the row key. I perform 2 look-ups for my queries: 1) Fetch the row from DateIndex that includes the date I'm looking for. This returns 1,000 columns where the column names are the UUIDs of the messages. 2) Do a multi-get (Hector client) using those 1,000 row keys I got from the first query. Query 1 is taking ~300ms to fetch 1,000 columns from a single row... respectable. However, query 2 is taking over 50s to perform 1,000 row look-ups! Also, when I scale down to 100 row look-ups for query 2, the time scales in a similar fashion, down to 5s. Am I doing something wrong here? It seems like taking 5s to look up 100 rows in a distributed hash table is way too slow. Thoughts? Bill-
Re: Super Slow Multi-gets
Bill, It still sounds really strange. Can you reproduce it and note down the steps? I'm sure people here would be pleased to repeat it. Regards, Utku On Fri, Feb 11, 2011 at 5:34 AM, Mark Guzman segfa...@hasno.info wrote: I assume this should be set on all of the servers? Is there anything in particular one would look for in the log results? On Feb 10, 2011, at 4:37 PM, Aaron Morton wrote: Assuming cassandra 0.7, in log4j-server.properties make it look like this... log4j.rootLogger=DEBUG,stdout,R A On 11 Feb, 2011, at 10:30 AM, Bill Speirs bill.spe...@gmail.com wrote: I switched my implementation to use a thread pool of 10 threads, each multi-getting 10 keys/rows. This reduces my time from 50s to 5s for fetching all 1,000 messages. I started looking through the Cassandra source to find where the parallel requests are actually made, and I believe it's in org.apache.cassandra.service.StorageProxy.java fetchRows; is this correct? I noticed a number of logger.debug calls; what do I need to set in my log4j.properties file to see these messages, as they would probably help me determine what is taking so long? Currently my log4j.properties file looks like this and I'm not seeing the messages: log4j.appender.stdout=org.apache.log4j.ConsoleAppender log4j.appender.stdout.layout=org.apache.log4j.SimpleLayout log4j.category.org.apache=DEBUG, stdout log4j.category.me.prettyprint=DEBUG, stdout Thanks... Bill- On Thu, Feb 10, 2011 at 12:53 PM, Bill Speirs bill.spe...@gmail.com wrote: Each message row is well under 1K, so I don't think it is the network... plus all boxes are on a fast LAN. Bill-
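Bill's workaround (a thread pool where each worker multi-gets a small chunk of keys) can be sketched generically. `fetch_rows` below is a hypothetical stand-in for whatever multi-get call your client library provides:

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(keys, size):
    """Split the key list into fixed-size chunks."""
    for i in range(0, len(keys), size):
        yield keys[i:i + size]

def parallel_multiget(fetch_rows, keys, workers=10, chunk_size=10):
    """Issue many small multi-gets in parallel and merge the results.
    `fetch_rows` takes a list of keys and returns a {key: row} dict."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(fetch_rows, chunked(keys, chunk_size)):
            results.update(partial)
    return results
```

The design tradeoff is the same one discussed in the thread: smaller chunks give more parallelism across coordinator nodes, at the cost of more round trips per request.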
Implementing a LRU in Cassandra
Hi All, I'm sure people here have tried to solve similar questions. Say I'm tracking pages, and I want to access the least recently used 1,000 unique pages (i.e. column names). How can I achieve this? Using a row with, say, ttl=60 seconds would solve the problem of accessing the least recently used unique pages in the last minute. Thanks for any comments and help. Regards, Utku
Re: Hadoop Integration doesn't work when one node is down
I've created an issue, was this what you were asking Jonathan? https://issues.apache.org/jira/browse/CASSANDRA-1927 On Mon, Jan 3, 2011 at 12:24 AM, Jonathan Ellis jbel...@gmail.com wrote: Can you create one? On Sun, Jan 2, 2011 at 4:39 PM, mck m...@apache.org wrote: Is this a bug or feature or a misuse? i can confirm this bug. on a 3 node cluster testing environment with RF 3. (and no issue exists for it AFAIK). ~mck -- Simplicity is the ultimate sophistication Leonardo Da Vinci's (William of Ockham) | www.semb.wever.org | www.sesat.no | www.finn.no| http://xss-http-filter.sf.net -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Replacing nodes of the cluster in 0.7.0-RC1
Since no reply came in a few days, I tried my proposed steps and it all worked fine. Just to let you know.
Replacing nodes of the cluster in 0.7.0-RC1
Hi All, I'm currently not happy with the hardware and the operating system of our 4-node cassandra cluster. I'm planning to move the cluster to a different hardware/OS architecture. For this purpose I'm planning to bring up 4 new nodes, so that each new node will be a replacement for a node in the current cluster. I should also note that the IP addresses will be changing; as far as I remember, cassandra caused problems when there was an IP change back in version 0.6. So what steps should I take to achieve this? Will a straightforward approach like this work? * drain all nodes * copy the data files to the new hosts * change configuration: seeds, datadir, tokens, etc. * bring up the cluster Regards, Utku
Detecting failed nodes and restarting
Hi All, The question is really simple: is there anyone out there using a set of scripts in production that detect failures of cassandra processes and restart them, or take other required actions? If so, how could we implement a generic solution for this problem? Regards, Utku
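One generic approach is a cron-driven watchdog that probes the thrift port and restarts the service when the probe fails. A minimal sketch follows; the host, port, and restart command are assumptions to adapt to your environment:

```python
import socket
import subprocess

def is_up(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def watchdog(host="127.0.0.1", port=9160,
             restart_cmd=("service", "cassandra", "restart")):
    """Restart cassandra if its thrift port (9160 by default) is down.
    The restart command is a hypothetical init-script invocation."""
    if not is_up(host, port):
        subprocess.call(restart_cmd)
```

A port probe only proves the process is accepting connections, not that it is healthy; a more thorough check would issue a trivial read through the client API.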
Deleting the datadir for system keyspace in 0.7
Hello All, I'm wondering, before restarting a node in a cluster: if I delete the system keyspace, what data would I be losing? Would I be losing anything? Regards, Utku
Re: Deleting the datadir for system keyspace in 0.7
So, can the practice of deleting the system datadir and setting the token in the configuration (so that we're not losing it) be treated as a safe(!) operation if we're OK with losing the hints? Or are there other things to be aware of? Regards, Utku On Mon, Nov 15, 2010 at 3:25 PM, Jonathan Ellis jbel...@gmail.com wrote: ... but blowing away your saved token is a great way to lose data if you don't know what you're doing. On Mon, Nov 15, 2010 at 8:17 AM, Gary Dusbabek gdusba...@gmail.com wrote: Mostly these things: stored schema information, cached cluster info, the token, hints. Everything but the hints can be replaced. Gary. On Mon, Nov 15, 2010 at 06:29, Utku Can Topçu u...@topcu.gen.tr wrote: Hello All, I'm wondering, before restarting a node in a cluster: if I delete the system keyspace, what data would I be losing? Would I be losing anything? Regards, Utku -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Cassandra Hadoop Integration not compatible with Hadoop 0.21.0
When I try to read a CF from Hadoop, just after submitting the job I get this error:

Exception in thread main java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:88)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:401)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:418)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:338)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:960)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:976)

However, the same code works fine with hadoop 0.20.2. Is there a prospective patch for this issue? Regards, Utku
Re: Time to wait for CF to be consistent after stopping writes.
Gary, Thank you for your comments. I also have another question in mind: if on all nodes nodetool cfstats shows that the memtable size is 0, can I safely assume that all values are consistent? Regards, Utku On Wed, Oct 27, 2010 at 3:24 PM, Gary Dusbabek gdusba...@gmail.com wrote: On Wed, Oct 27, 2010 at 05:08, Utku Can Topçu u...@topcu.gen.tr wrote: Hi, For a columnfamily in a keyspace which has RF=3, I'm issuing writes with ConsistencyLevel.ONE. In the configuration I have: - memtable_flush_after_mins: 30 - memtable_throughput_in_mb: 32 I'm writing to this columnfamily continuously for about 1 hour and then stop writing. So the question is: how long should I wait after stopping writes to that particular CF so that all writes take place and the data contained in the CF is consistent? There is no way to determine this precisely. Depending on your nodes and network it could be as short as a few milliseconds or much longer. Which metrics should I be checking to ensure that the CF is now consistent? Execute a read using ConsistencyLevel.ALL. If the value is not yet consistent, read repair will ensure that it soon will be. Another approach is to write using ConsistencyLevel.ALL, although that would decrease your write throughput. And additionally, if I was using ConsistencyLevel.QUORUM or ConsistencyLevel.ALL, would it make a difference? Precisely. Would reducing RF=3 to RF=1 make my life on this decision easier? It would make determining consistency better, but RF=1 isn't going to be very fault tolerant. Gary.
Time to wait for CF to be consistent after stopping writes.
Hi, For a columnfamily in a keyspace which has RF=3, I'm issuing writes with ConsistencyLevel.ONE. In the configuration I have: - memtable_flush_after_mins: 30 - memtable_throughput_in_mb: 32 I'm writing to this columnfamily continuously for about 1 hour and then stop writing. So the question is: how long should I wait after stopping writes to that particular CF so that all writes take place and the data contained in the CF is consistent? Which metrics should I be checking to ensure that the CF is now consistent? And additionally, if I was using ConsistencyLevel.QUORUM or ConsistencyLevel.ALL, would it make a difference? Would reducing RF=3 to RF=1 make my life on this decision easier? Regards, Utku
Reading a keyrange when using RP
If I'm not mistaken, cassandra has been providing support for key-range queries also on RP. However, when I try to define a key range such as start: key100, end: key200, I get an error like: InvalidRequestException(why:start key's md5 sorts after end key's md5. this is not allowed; you probably should not specify end key at all, under RandomPartitioner) How can I use cassandra to get a key range under RP? Best Regards, Utku
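The error comes from RandomPartitioner ordering rows by the md5 of the key rather than by the key itself, so a lexical range like key100..key200 is meaningless on the ring. A small demonstration (illustrative only):

```python
import hashlib

def token(key):
    """RandomPartitioner-style token: the md5 of the key as a big integer."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

keys = ["key100", "key150", "key200"]
lexical = sorted(keys)
by_token = sorted(keys, key=token)
# the two orders generally disagree, which is why a lexical start/end
# pair is rejected: a valid range under RP must be expressed in token order
```

To scan all rows under RP you page through the ring using token ranges (or fetch everything and filter client-side); lexical key-range queries need OrderPreservingPartitioner instead.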
creating and dropping columnfamilies as a use case
Hi All, In the current project I'm working on, I have a use case for hourly analysis of the rows. Since the 0.7x branch supports creating and dropping columnfamilies on the fly, my proposed use case is: * Create a CF at the very beginning of every hour * At the end of the 1-hour period, analyze the data stored in the CF with Hadoop * Drop the CF afterwards Can you foresee any problems in continuously creating and dropping columnfamilies? Regards, Utku
using jna.jar Unknown mlockall error 0
Hi, To continue with memory optimizations, I've been trying to use JNA. However, when I copy jna.jar to the lib directory, I get the warning below. I'm currently running cassandra 0.6.5. WARN [main] 2010-10-08 09:16:18,924 FBUtilities.java (line 595) Unknown mlockall error 0 Should I be worried about this warning? Does it mean JNA might not be working? Regards, Utku
Re: using jna.jar Unknown mlockall error 0
I'm running an Ubuntu 9.10 linux box. On Fri, Oct 8, 2010 at 11:33 AM, Roger Schildmeijer schildmei...@gmail.com wrote: On Fri, Oct 8, 2010 at 11:27 AM, Utku Can Topçu u...@topcu.gen.tr wrote: Hi, To continue with memory optimizations, I've been trying to use JNA. However, when I copy jna.jar to the lib directory, I get the warning below. I'm currently running cassandra 0.6.5. WARN [main] 2010-10-08 09:16:18,924 FBUtilities.java (line 595) Unknown mlockall error 0 A return value of 0 usually indicates that the operation returned successfully (at least on most modern POSIX systems). What OS are you using? Should I be worried about this warning? Does it mean JNA might not be working? Regards, Utku WBR Roger Schildmeijer
Re: using jna.jar Unknown mlockall error 0
Thanks Nicolas, I've just tried running as root and the warning did not show up. Do we need to run cassandra as root in order to use JNA? Regards, Utku On Fri, Oct 8, 2010 at 11:45 AM, Nicolas Mathieu nico...@gmail.com wrote: If I'm not wrong, when I run cassandra as root I don't get that mlockall error 0. Maybe there is another solution anyway. nico008 On 08/10/2010 11:33, Roger Schildmeijer wrote: On Fri, Oct 8, 2010 at 11:27 AM, Utku Can Topçu u...@topcu.gen.tr wrote: Hi, To continue with memory optimizations, I've been trying to use JNA. However, when I copy jna.jar to the lib directory, I get the warning below. I'm currently running cassandra 0.6.5. WARN [main] 2010-10-08 09:16:18,924 FBUtilities.java (line 595) Unknown mlockall error 0 A return value of 0 usually indicates that the operation returned successfully (at least on most modern POSIX systems). What OS are you using? Should I be worried about this warning? Does it mean JNA might not be working? Regards, Utku WBR Roger Schildmeijer
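If I'm not mistaken, mlockall needs either the CAP_IPC_LOCK capability or a raised locked-memory limit, which is why it succeeds as root. A common alternative to running cassandra as root is raising the memlock limit for the user that runs it; a sketch, assuming the user is named cassandra (adapt the user name and path to your setup):

```
# /etc/security/limits.conf
cassandra  soft  memlock  unlimited
cassandra  hard  memlock  unlimited
```

You can check the effective limit with `ulimit -l` in the shell that starts cassandra; the default on many distributions is a small value like 32 or 64 KB, far too little to lock the JVM heap.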
Re: Tuning cassandra to use less memory
Hi Oleg, I've also been looking into these after some research. I've been tackling the problem by: 1. Raising the default max and min heap from 1G to 1500M. 2. Not using row caches; the key caches are set to 1000 (they were 200K as the default). 3. Lowering the memtable throughput to 32MB. 4. Using a 32-bit JVM. - Additionally, I've changed the disk access mode to mmap_index_only (this was the suggested mode). - I've also stopped using OPP and switched to RP. In our system there are currently 4 nodes, and there's one active keyspace containing 12 active standard columnfamilies. The nodes are still swapping, even though swappiness is set to zero right now; after swapping comes the OOM. I'm not sure what else to do, but 1.7 G does not seem to fit our needs. Do you think so? Regards, Utku On Wed, Oct 6, 2010 at 12:47 PM, Oleg Anastasyev olega...@gmail.com wrote: Hi All, We're currently starting to get OOM exceptions in our cluster. I'm trying to push the limits of our machines. Currently we have 1.7 G memory (ec2-medium). I'm wondering if, by tweaking some of cassandra's configuration settings, it is possible to make it live in peace with less memory. 1. What is the current java heap size on your nodes? Is it the default 1Gb? Try to configure more. 2. Do you use row or key caches? Try to lower their sizes in the configuration. 3. What is the memtable throughput mb threshold? You can try to lower it. 4. Do you use a 32-bit or 64-bit VM? For 1.7Gb RAM a 32-bit VM is enough and it uses less RAM, so you can give more to the Java heap.
Re: A proposed use case, any comments and experience is appreciated
What I understand from behaving like a deleted column is: they'll be there for at most GCGraceSeconds? On Mon, Oct 4, 2010 at 3:51 PM, Jonathan Ellis jbel...@gmail.com wrote: Expiring columns are 0.7 only. An expired column behaves like a deleted column until it is compacted away. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: A proposed use case, any comments and experience is appreciated
Hi Jonathan, Thank you for mentioning the expiring columns feature. I didn't know it existed; that's really great news. First of all, does the current 0.6 branch support it? If not, is the patch available for 0.6.5 somehow? And about the deletion issue: if all the columns in a row expire, when will the row be deleted? Will I see the row in my map inputs somehow, and for how long? Regards, Utku On Mon, Oct 4, 2010 at 3:30 PM, Jonathan Ellis jbel...@gmail.com wrote: A simpler approach might be to insert expiring columns into a 2nd CF with a TTL of one hour. On Mon, Oct 4, 2010 at 5:12 AM, Utku Can Topçu u...@topcu.gen.tr wrote: Hey All, I'm planning to run Map/Reduce on one of the ColumnFamilies. The keys are formed in such a fashion that they are indexed in descending order by time, so I'll be analyzing the data for every hour iteratively. Since the current Hadoop integration does not support partial ColumnFamily analysis, I feel that I'll need to dump the data of the last hour, put it on the Hadoop cluster, and do my analysis on the flat text file. Can you think of any better way of getting the data of a keyrange into a Hadoop cluster for analysis? Regards, Utku -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Hardware change of a node in the cluster
Hey All, Recently I tried to upgrade (a hardware upgrade) one of the nodes in my Cassandra cluster from ec2-small to ec2-large. However, there were problems: since the IP of the new instance was different from the previous instance, the other nodes did not recognize it in the ring. So what would be the best practice for a complete hardware change of one node in the cluster while keeping the data that it has? Regards, Utku
Best strategy for adding new nodes to the cluster
Hi All, We're currently running a Cassandra cluster with a replication factor of 3, consisting of 4 nodes. The current situation is: - The nodes are all identical (AWS small instances). - The data directory is on the partition (/mnt) which has 150 GB capacity, and each node has around 90 GB of load, so 60 GB of free space is left per node. So adding a new node to the cluster seems likely to cause problems for us: I think the node that streams the data to the new bootstrapping node will not have enough disk space to anticompact its data. What would be the best practice for such scenarios? Regards, Utku
Having different 0.6.x instances in one Cassandra cluster
Hi All, I'm planning to use the current 0.6.4 stable release to create an image that will be the base for nodes in our Cassandra cluster. However, the 0.6.5 release is on the way. When 0.6.5 is released, will it be possible to have some of the nodes stay on 0.6.4 while new nodes run 0.6.5? Best Regards, Utku
Lucene CassandraDirectory Implementation
Hi All, I was browsing through the Lucene JIRA and came across the issue titled "A Column-Oriented Cassandra-Based Lucene Directory" at https://issues.apache.org/jira/browse/LUCENE-2456 Has anyone had a chance to test it? If so, do you think it's an efficient implementation as a replacement for the FSDirectory? Best Regards, Utku
Cassandra Data Model Design Visualization
Hey Guys, I've been designing an application that consists of more than 20 ColumnFamilies. Each ColumnFamily has some columns referencing keys in other ColumnFamilies, and some keys are combinations of keys/columns from other ColumnFamilies. I guess most people use this kind of approach when building a design for an application. Are there any decent visualization schemes for designing Cassandra ColumnFamilies? Best Regards, Utku
Implementing Counter on Cassandra
Hey Guys, In a project I'm currently involved in, I need to have some columns holding incremented data. The easy approach for implementing a counter with increments, as far as I can tell, is read - increment - insert; however, this is not an atomic operation and can easily be corrupted over time. Do you have any best practices for implementing an atomic counter on Cassandra? Best Regards, Utku
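To make the corruption concrete, here is a minimal sketch (plain Java, not Cassandra code) of the lost-update race inherent in read - increment - insert. The HashMap stands in for a column, and the interleaving of two clients is simulated sequentially so the lost update is reproducible.

```java
import java.util.HashMap;
import java.util.Map;

// Two clients both read the counter before either writes back,
// so both compute the same new value and one increment is silently lost.
public class LostUpdateDemo {
    static long runRace() {
        Map<String, Long> column = new HashMap<>();
        column.put("counter", 5L);

        long readByA = column.get("counter"); // client A reads 5
        long readByB = column.get("counter"); // client B also reads 5

        column.put("counter", readByA + 1);   // A writes back 6
        column.put("counter", readByB + 1);   // B overwrites with 6: A's increment is lost

        return column.get("counter");         // 6, although two increments ran
    }

    public static void main(String[] args) {
        System.out.println(runRace()); // prints 6, not 7
    }
}
```

Avoiding this requires coordination that plain column writes don't give you: either an external lock/CAS around the read-modify-write, or a data model where each increment is written as its own column and summed at read time.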
Getting keys in a range sorted with respect to last access time
Hey All, First of all, I'll start with some questions on the default behavior of the get_range_slices method defined in the Thrift API. Given a keyrange with start-key kstart and end-key kend, assuming kstart < kend: * Is it true that I'll get the range [kstart, kend) (kstart inclusive, kend exclusive)? * What's the default order of the rows in the result list (assuming I am using an OPP)? * (How) can we reverse the sorting order? * What would be the behavior in the case kstart > kend? Will I get an empty result list? Secondly, I have a use case where I need to access the most recently updated rows. How can this be done? By writing a new partitioner? Best Regards, Utku
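The distinctions being asked about can be sketched in plain Java (this is not the Thrift API and does not answer what get_range_slices actually does), using a TreeMap as a stand-in for rows sorted by an OPP. It also shows a common modeling workaround for "most recently updated rows": a secondary index keyed by an inverted, zero-padded timestamp, so that ascending key order is descending time order. All names are illustrative.

```java
import java.util.TreeMap;

// TreeMap as a stand-in for an OPP-sorted row set: half-open ranges,
// reversed iteration, and an inverted-timestamp recency index.
public class RangeSliceSketch {
    static String sliceKeys() {
        TreeMap<String, String> rows = new TreeMap<>();
        rows.put("a", "1"); rows.put("b", "2"); rows.put("c", "3");
        // a half-open range [kstart, kend): start inclusive, end exclusive
        return rows.subMap("a", true, "c", false).keySet().toString();
    }

    static String reversedKeys() {
        TreeMap<String, String> rows = new TreeMap<>();
        rows.put("a", "1"); rows.put("b", "2"); rows.put("c", "3");
        // reversed order comes from iterating the descending view
        return rows.descendingKeySet().toString();
    }

    static String mostRecent() {
        // inverted timestamp key: the most recently updated row sorts first;
        // zero-padding keeps lexicographic order equal to numeric order
        TreeMap<String, String> byRecency = new TreeMap<>();
        byRecency.put(String.format("%019d", Long.MAX_VALUE - 2_000L), "rowX"); // updated at t=2000
        byRecency.put(String.format("%019d", Long.MAX_VALUE - 1_000L), "rowY"); // updated at t=1000
        return byRecency.firstEntry().getValue();
    }

    public static void main(String[] args) {
        System.out.println(sliceKeys());    // [a, b]
        System.out.println(reversedKeys()); // [c, b, a]
        System.out.println(mostRecent());   // rowX
    }
}
```

The recency index sidesteps writing a new partitioner: each update also writes a row into the index CF, and reading the first few keys of that CF yields the latest-touched rows.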
Re: Anyone using hadoop/MapReduce integration currently?
Hi Jeremy, Why are you using Cassandra versus using data stored in HDFS or HBase? - I'm thinking of using it for realtime streaming of user data. While streaming the requests, I'm also using Lucandra to index the data in realtime. It's a better option than HBase or native HDFS flat files because of the low latency in writes. Is there anything holding you back from using it (if you would like to use it but currently cannot)? - My answer to this would be: the current integration only supports using the whole range of the CF as input for the map phase; it would be much better if the InputFormat had support for a KeyRange. Best Regards, Utku On Tue, May 25, 2010 at 6:35 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: I'll be doing a presentation on Cassandra's (0.6+) Hadoop integration next week. Is anyone currently using MapReduce or the initial Pig integration? (If you're unaware of such integration, see http://wiki.apache.org/cassandra/HadoopSupport) If so, could you post to this thread on how you're using it or planning on using it (if not covered by the shroud of secrecy)? e.g. What is the use case? Why are you using Cassandra versus using data stored in HDFS or HBase? Are you using a separate Hadoop cluster to run the MR jobs on, or perhaps are you running the JobTracker and TaskTrackers on Cassandra nodes? Is there anything holding you back from using it (if you would like to use it but currently cannot)? Thanks!
Re: Real-time Web Analysis tool using Cassandra. Doubts...
What makes Cassandra a poor choice is the fact that you can't use a keyrange as input to the map phase in Hadoop. On Wed, May 12, 2010 at 4:37 PM, Jonathan Ellis jbel...@gmail.com wrote: On Tue, May 11, 2010 at 1:52 PM, Paulo Gabriel Poiati paulogpoi...@gmail.com wrote: - First of all, my first thought is to have two CFs: one for raw client requests (~10 million++ per day) and another for metrics aggregated over some defined interval of time like 1min, 5min, 15min... Is this a good approach? Sure. - Is it a good idea to use an OrderPreservingPartitioner, to maintain the order of my requests in the raw data CF? Or is the overhead too big? The problem with OPP isn't overhead (it is lower-overhead than RP) but the tendency to have hotspots in sequentially-written data. - Initially the cluster will contain only three nodes; is that a problem (too few, maybe)? You'll have to do some load testing to see. - I think the best way to do the aggregation job is through a Hadoop MapReduce job, right? Is there any other way to consider? Map/Reduce is usually better than rolling your own because it parallelizes for you. - Is Cassandra really suitable for this? Maybe HBase is better in this case? Nothing here makes me think Cassandra is a poor choice. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
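On the 1min/5min/15min aggregation point above, the usual trick is to derive the row key for the aggregate CF by rounding each event timestamp down to its interval boundary, so every event in the same interval lands in the same row. A small sketch (plain Java; the key format is illustrative, not from any particular schema):

```java
// Round an event timestamp down to its interval boundary and build a
// row key for the aggregate CF, e.g. "5min:1273586400000".
public class BucketKeys {
    static String bucketKey(long epochMillis, int minutes) {
        long intervalMillis = minutes * 60_000L;
        long bucketStart = (epochMillis / intervalMillis) * intervalMillis;
        return minutes + "min:" + bucketStart;
    }

    public static void main(String[] args) {
        // 90 s after the epoch falls in the second 1-min bucket...
        System.out.println(bucketKey(90_000L, 1)); // 1min:60000
        // ...but still in the first 5-min bucket
        System.out.println(bucketKey(90_000L, 5)); // 5min:0
    }
}
```

The aggregation job (MapReduce or otherwise) then just increments columns under the bucket key, and reads for a dashboard become single-row lookups.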
Distributed export and import into cassandra
Hey All, I have a simple sample use case: the aim is to export the columns in a column family into flat files, with the keys in the range from k1 to k2. Since every node in the cluster holds some portion of the data, is it possible to make each node dump its own local data volume to a flat file? Best Regards, Utku
ColumnFamilyOutputFormat?
Hey All, I've been looking at the documentation and related articles about Cassandra and Hadoop integration, and I'm only seeing ColumnFamilyInputFormat for now. What if I want to write directly to Cassandra after a reduce? What comes to my mind is: in the Reducer's setup I'd initialize a Cassandra client so that, rather than emitting the results to the MR framework, it would be possible to output them to Cassandra in a simple way. Can you think of any other high-level solutions, like an OutputFormat or so? Best Regards, Utku
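The "client in the Reducer's setup" idea above usually ends up as a batching pattern: buffer mutations during reduce calls and flush them in groups, with a final flush in cleanup. Here is a framework-free sketch of that pattern; the Consumer stands in for whatever batch call the client exposes (e.g. Thrift's batch_mutate), and all names are illustrative, not an actual OutputFormat.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Buffers (key, column, value) triples and hands them to a pluggable
// "flusher" in fixed-size batches, the way a reducer-held Cassandra client
// would group writes instead of sending one RPC per value.
public class BatchingWriter {
    private final int batchSize;
    private final Consumer<List<String[]>> flusher;
    private final List<String[]> buffer = new ArrayList<>();
    int flushCount = 0;

    BatchingWriter(int batchSize, Consumer<List<String[]>> flusher) {
        this.batchSize = batchSize;
        this.flusher = flusher;
    }

    void write(String key, String column, String value) {
        buffer.add(new String[] { key, column, value });
        if (buffer.size() >= batchSize) flush();
    }

    // Call once more from the reducer's cleanup() to drain the remainder.
    void flush() {
        if (buffer.isEmpty()) return;
        flusher.accept(new ArrayList<>(buffer));
        buffer.clear();
        flushCount++;
    }

    public static void main(String[] args) {
        List<String[]> sent = new ArrayList<>();
        BatchingWriter w = new BatchingWriter(10, sent::addAll);
        for (int i = 0; i < 25; i++) w.write("url" + i, "count", "1");
        w.flush(); // drain the last partial batch of 5
        System.out.println(sent.size() + " mutations in " + w.flushCount + " batches");
    }
}
```

A real OutputFormat would wrap exactly this kind of writer behind the Hadoop RecordWriter interface, which is why the pattern ports over cleanly once one exists.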
Re: ColumnFamilyInputFormat KeyRange scans on a CF
Do you mean running the get_range_slices from a single? Yes, that would be reasonable for a relatively small key range; but when it comes to analyzing a really big range in a really big data collection (i.e. like the one we are currently populating), having a way to distribute the reads among the cluster seems the only reasonable solution. In the current situation, the best option might be distributing the range among ColumnFamilies (say, 1 CF for each day) and emptying each CF, in order to reuse it for another day's range, after analyzing its data. Can you suggest a workaround for this? On Fri, Apr 30, 2010 at 3:22 PM, Jonathan Ellis jbel...@gmail.com wrote: Sounds like doing this w/o m/r with get_range_slices is a reasonable way to go. On Thu, Apr 29, 2010 at 6:04 PM, Utku Can Topçu u...@topcu.gen.tr wrote: I'm currently writing collected data continuously to Cassandra, with keys starting with a timestamp and a unique identifier (like 2009.01.01.00.00.00.RANDOM) so I can query in time ranges. I'm thinking of running periodic MapReduce jobs that go through a designated time period; I might want to analyze the data only between 2009.01 and 2009.02. I had done this previously with HBase, but I thought Cassandra would be a better choice for continuously storing data in a safe manner. I guess this briefly explains my intended use case. Best Regards, Utku On Thu, Apr 29, 2010 at 11:32 PM, Jonathan Ellis jbel...@gmail.com wrote: It's technically possible but 0.6 does not support this, no. What is the use case you are thinking of? On Thu, Apr 29, 2010 at 11:14 AM, Utku Can Topçu u...@topcu.gen.tr wrote: Hi, I've been trying to use Cassandra as a kind of supplementary input source for Hadoop MapReduce jobs. The default usage of the ColumnFamilyInputFormat does a full ColumnFamily scan to use as map input within the MapReduce framework. However, I believe it should be possible to give a keyrange so that only the specified range is scanned. Is this by any means possible?
Best Regards, Utku -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: ColumnFamilyInputFormat KeyRange scans on a CF
I meant, in the first sentence: running the get_range_slices from a single point. On Fri, Apr 30, 2010 at 4:08 PM, Utku Can Topçu u...@topcu.gen.tr wrote: Do you mean running the get_range_slices from a single? Yes, that would be reasonable for a relatively small key range; but when it comes to analyzing a really big range in a really big data collection (i.e. like the one we are currently populating), having a way to distribute the reads among the cluster seems the only reasonable solution. In the current situation, the best option might be distributing the range among ColumnFamilies (say, 1 CF for each day) and emptying each CF, in order to reuse it for another day's range, after analyzing its data. Can you suggest a workaround for this?
TimedOutException when using the ColumnFamilyInputFormat
Hey All, I'm trying to run some tests on Cassandra and Hadoop integration. I'm basically following the word count example at https://svn.apache.org/repos/asf/cassandra/trunk/contrib/word_count/src/WordCount.java using the ColumnFamilyInputFormat. Currently I have a one-node Cassandra and Hadoop setup on the same machine. I'm having problems if there is more than one map task running on the same node; please find a copy of the error message below. If I limit the map tasks per tasktracker to 1, the MapReduce job works fine without any problems at all. Do you think it's a known issue, or am I doing something wrong in the implementation?

---error
10/04/29 13:47:37 INFO mapred.JobClient: Task Id : attempt_201004291109_0024_m_00_1, Status : FAILED
java.lang.RuntimeException: TimedOutException()
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:165)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:215)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:97)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:91)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: TimedOutException()
        at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11015)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:623)
        at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:597)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:142)
        ... 11 more
---
Best Regards, Utku
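One knob worth checking for timeouts like the one above in the 0.6 era is the server-side RPC timeout; raising it gives each get_range_slices call in the record reader more headroom (reducing the number of rows fetched per call on the Hadoop side helps too). The element name below is from memory of the 0.6 storage-conf.xml, so verify it against your config:

```xml
<!-- storage-conf.xml: allow range scans more time before TimedOutException -->
<RpcTimeoutInMillis>30000</RpcTimeoutInMillis>
```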
Re: ColumnFamilyInputFormat KeyRange scans on a CF
I'm currently writing collected data continuously to Cassandra, with keys starting with a timestamp and a unique identifier (like 2009.01.01.00.00.00.RANDOM) so I can query in time ranges. I'm thinking of running periodic MapReduce jobs that go through a designated time period; I might want to analyze the data only between 2009.01 and 2009.02. I had done this previously with HBase, but I thought Cassandra would be a better choice for continuously storing data in a safe manner. I guess this briefly explains my intended use case. Best Regards, Utku On Thu, Apr 29, 2010 at 11:32 PM, Jonathan Ellis jbel...@gmail.com wrote: It's technically possible but 0.6 does not support this, no. What is the use case you are thinking of? On Thu, Apr 29, 2010 at 11:14 AM, Utku Can Topçu u...@topcu.gen.tr wrote: Hi, I've been trying to use Cassandra as a kind of supplementary input source for Hadoop MapReduce jobs. The default usage of the ColumnFamilyInputFormat does a full ColumnFamily scan to use as map input within the MapReduce framework. However, I believe it should be possible to give a keyrange so that only the specified range is scanned. Is this by any means possible? Best Regards, Utku -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
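What makes the key scheme above attractive is that a fixed-width timestamp prefix turns a time period into a plain lexicographic string range, so "the data between 2009.01 and 2009.02" is just a key range. A small sketch in plain Java (the TreeSet stands in for OPP-sorted keys; names are illustrative):

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Keys like "2009.01.01.00.00.00.RANDOM" sort chronologically because the
// timestamp prefix is fixed-width, so selecting a month is a string range.
public class TimePrefixedKeys {
    // [fromMonth, toMonth): e.g. ("2009.01", "2009.02") selects all of January
    static SortedSet<String> monthRange(TreeSet<String> keys, String fromMonth, String toMonth) {
        return keys.subSet(fromMonth, toMonth);
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>();
        keys.add("2008.12.31.23.59.59.aaa");
        keys.add("2009.01.15.12.00.00.bbb");
        keys.add("2009.02.01.00.00.00.ccc");
        System.out.println(monthRange(keys, "2009.01", "2009.02"));
        // only the January key is selected
    }
}
```

This is exactly the range a keyrange-aware InputFormat would hand to its splits; without one, the same subSet logic has to run client-side over get_range_slices pages.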