Re: large range read in Cassandra
For the benefit of others: I ended up finding out that the CQL library I was using
(https://github.com/gocql/gocql) at the time defaults to no paging, so Cassandra
was trying to pull all rows of the partition into memory at once. Setting the
page size to a reasonable number seems to have done the trick.

On Tue, Nov 25, 2014 at 2:54 PM, Dan Kinder dkin...@turnitin.com wrote:
> Thanks, very helpful Rob, I'll watch for that.

On Tue, Nov 25, 2014 at 11:45 AM, Robert Coli rc...@eventbrite.com wrote:
> If you're paging through a single partition, that's likely to be fine. When
> you said "range reads ... over rows" my impression was you were talking about
> attempting to page through millions of partitions.
>
> With that confusion cleared up, the likely explanation for the lack of
> availability in your case is heap pressure/GC time. Look for GCs around that
> time. Also, if you're using authentication, make sure that your
> authentication keyspace has a replication factor greater than 1.
>
> =Rob

--
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
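The fix described above might look roughly like the following in gocql (a sketch, not taken from the thread; the keyspace, table, and column names are made up, and the exact API may differ across gocql versions):

```go
package main

import "github.com/gocql/gocql"

func main() {
	cluster := gocql.NewCluster("127.0.0.1")
	cluster.Keyspace = "walker" // hypothetical keyspace name
	// Default page size for every query on sessions from this config,
	// so a large partition is streamed in pages rather than
	// materialized on the coordinator all at once.
	cluster.PageSize = 5000

	session, err := cluster.CreateSession()
	if err != nil {
		panic(err)
	}
	defer session.Close()

	// The page size can also be overridden per query. Iter fetches the
	// next page transparently as Scan exhausts the current one.
	iter := session.Query(`SELECT link FROM links WHERE domain = ?`, "example.com").
		PageSize(5000).Iter()
	var link string
	for iter.Scan(&link) {
		_ = link // dispatch the link for crawling here
	}
	if err := iter.Close(); err != nil {
		panic(err)
	}
}
```

Note this relies on the native-protocol paging introduced in Cassandra 2.0 (CASSANDRA-4415), so the server never has to hold the whole partition in memory for one read.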
Re: large range read in Cassandra
Thanks Rob.

To be clear, I expect this range query to take a long time and perform
relatively heavy I/O. What I expected Cassandra to do was use auto-paging
(https://issues.apache.org/jira/browse/CASSANDRA-4415,
http://stackoverflow.com/questions/17664438/iterating-through-cassandra-wide-row-with-cql3)
so that we aren't literally pulling the entire thing in at once.

Am I misunderstanding this use case? Could you clarify why exactly it would
slow way down? It seems like each read should be a simple range read from one
or two sstables.

If this won't work, it may mean we need to start using Hive/Spark/Pig etc.
sooner, or page it manually using LIMIT and WHERE [the last returned result].

On Mon, Nov 24, 2014 at 5:49 PM, Robert Coli rc...@eventbrite.com wrote:
> On Mon, Nov 24, 2014 at 4:26 PM, Dan Kinder dkin...@turnitin.com wrote:
>> We have a web crawler project currently based on Cassandra
>> (https://github.com/iParadigms/walker, written in Go and using the gocql
>> driver), with the following relevant usage pattern:
>> - Big range reads over a CF to grab potentially millions of rows and
>>   dispatch new links to crawl
>
> If you really mean millions of storage rows, this is just about the worst
> case for Cassandra. The problem you're having is probably that you shouldn't
> try to do this in Cassandra. Your timeouts are either from the read actually
> taking longer than the timeout, or from the reads provoking heap pressure and
> resulting GC.
>
> =Rob
Re: large range read in Cassandra
On Tue, Nov 25, 2014 at 10:45 AM, Dan Kinder dkin...@turnitin.com wrote:
> To be clear, I expect this range query to take a long time and perform
> relatively heavy I/O. What I expected Cassandra to do was use auto-paging
> (https://issues.apache.org/jira/browse/CASSANDRA-4415,
> http://stackoverflow.com/questions/17664438/iterating-through-cassandra-wide-row-with-cql3)
> so that we aren't literally pulling the entire thing in. Am I
> misunderstanding this use case? Could you clarify why exactly it would slow
> way down? It seems like with each read it should be doing a simple range
> read from one or two sstables.

If you're paging through a single partition, that's likely to be fine. When
you said "range reads ... over rows" my impression was you were talking about
attempting to page through millions of partitions.

With that confusion cleared up, the likely explanation for lack of
availability in your case is heap pressure/GC time. Look for GCs around that
time. Also, if you're using authentication, make sure that your authentication
keyspace has a replication factor greater than 1.

=Rob
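Rob's last point can be addressed with a one-time schema change. As a sketch for the two-node cluster described earlier (the strategy class and factor are deployment-specific choices, not from the thread):

```
ALTER KEYSPACE system_auth
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
```

After altering, run `nodetool repair system_auth` on each node so the new replicas actually receive the auth data; otherwise a node that loses its peer can fail logins.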
Re: large range read in Cassandra
Thanks, very helpful Rob, I'll watch for that.

On Tue, Nov 25, 2014 at 11:45 AM, Robert Coli rc...@eventbrite.com wrote:
> On Tue, Nov 25, 2014 at 10:45 AM, Dan Kinder dkin...@turnitin.com wrote:
>> To be clear, I expect this range query to take a long time and perform
>> relatively heavy I/O. What I expected Cassandra to do was use auto-paging
>> (https://issues.apache.org/jira/browse/CASSANDRA-4415,
>> http://stackoverflow.com/questions/17664438/iterating-through-cassandra-wide-row-with-cql3)
>> so that we aren't literally pulling the entire thing in. Am I
>> misunderstanding this use case? Could you clarify why exactly it would
>> slow way down? It seems like with each read it should be doing a simple
>> range read from one or two sstables.
>
> If you're paging through a single partition, that's likely to be fine. When
> you said "range reads ... over rows" my impression was you were talking
> about attempting to page through millions of partitions.
>
> With that confusion cleared up, the likely explanation for lack of
> availability in your case is heap pressure/GC time. Look for GCs around that
> time. Also, if you're using authentication, make sure that your
> authentication keyspace has a replication factor greater than 1.
>
> =Rob

--
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
large range read in Cassandra
Hi,

We have a web crawler project currently based on Cassandra
(https://github.com/iParadigms/walker, written in Go and using the gocql
driver), with the following relevant usage pattern:

- Big range reads over a CF to grab potentially millions of rows and dispatch
  new links to crawl
- Fast insert of new links (effectively using Cassandra to deduplicate)

We ultimately planned on doing the batch processing step (the dispatching) in
a system like Spark, but for the time being it is also in Go. We believe this
should work fine given that Cassandra now properly allows chunked iteration of
columns in a CF.

The issue is that periodically, while doing a particularly large range read,
other operations time out because that node is busy. In an experimental
cluster with only two nodes (and a replication factor of 2), I'll get an error
like:

    Operation timed out - received only 1 responses.

indicating that the second node took too long to reply. At the moment I have
the long range reads set to consistency level ANY, but the rest of the
operations are at QUORUM, so on this cluster they require responses from both
nodes. The relevant CF is also using LeveledCompactionStrategy. This happens
in both Cassandra 2.0 and 2.1. Despite this error I don't see any significant
I/O, memory consumption, or CPU usage.

Here are some of the configuration values I've played with.

Increasing timeouts:

    read_request_timeout_in_ms: 15000
    range_request_timeout_in_ms: 3
    write_request_timeout_in_ms: 1
    request_timeout_in_ms: 1

Getting rid of caches we don't need:

    key_cache_size_in_mb: 0
    row_cache_size_in_mb: 0

Each of the 2 nodes has an HDD for the commit log and a single HDD I'm using
for data. Hence the following thread config (maybe since I/O is not an issue I
should increase these?):

    concurrent_reads: 16
    concurrent_writes: 32
    concurrent_counter_writes: 32

Because I have a large number of columns and am not doing random I/O, I've
increased this:

    column_index_size_in_kb: 2048

It's something of a mystery why this error comes up. Of course with a 3rd node
it will get masked if I am doing QUORUM operations, but it still seems like it
should not happen, and that there is some kind of head-of-line blocking or
other issue in Cassandra. I would like to increase the amount of dispatching
I'm doing, but it bogs things down if I do. Any suggestions for other things
we can try here would be appreciated.

-dan
Re: large range read in Cassandra
On Mon, Nov 24, 2014 at 4:26 PM, Dan Kinder dkin...@turnitin.com wrote:
> We have a web crawler project currently based on Cassandra
> (https://github.com/iParadigms/walker, written in Go and using the gocql
> driver), with the following relevant usage pattern:
>
> - Big range reads over a CF to grab potentially millions of rows and
>   dispatch new links to crawl

If you really mean millions of storage rows, this is just about the worst case
for Cassandra. The problem you're having is probably that you shouldn't try to
do this in Cassandra. Your timeouts are either from the read actually taking
longer than the timeout, or from the reads provoking heap pressure and
resulting GC.

=Rob