On Mon, Nov 24, 2014 at 4:26 PM, Dan Kinder <dkin...@turnitin.com> wrote:

> We have a web crawler project currently based on Cassandra (
> https://github.com/iParadigms/walker, written in Go and using the gocql
> driver), with the following relevant usage pattern:
>
> - Big range reads over a CF to grab potentially millions of rows and
> dispatch new links to crawl
>

If you really mean millions of storage rows, this is just about the worst
case for Cassandra. The problem you're having is probably that you
shouldn't try to do this in Cassandra.

Your timeouts come either from the read actually taking longer than the
timeout, or from the reads provoking heap pressure and the resulting GC
pauses.
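
If the scan does have to stay in Cassandra, one common way to bound the cost
of each read is to split the ring into token sub-ranges and issue one smaller
query per sub-range. The sketch below is only an illustration of that
arithmetic, under assumptions not stated in this thread: it assumes the
Murmur3Partitioner (tokens from -2^63 to 2^63-1), and the names TokenRange,
SplitRing and the pk column are hypothetical. It does not talk to a cluster
and is not taken from walker or gocql.

```go
package main

import (
	"fmt"
	"math"
)

// TokenRange is one slice of the ring; Start and End are both inclusive.
// (Hypothetical type, for illustration only.)
type TokenRange struct {
	Start, End int64
}

// SplitRing divides the full signed 64-bit token space into n contiguous,
// non-overlapping ranges. It assumes Murmur3Partitioner-style tokens.
func SplitRing(n int) []TokenRange {
	if n < 1 {
		n = 1
	}
	// Work in unsigned offsets 0..2^64-1 so the arithmetic cannot overflow,
	// then shift each offset back into the signed token space.
	step := uint64(math.MaxUint64) / uint64(n)
	toToken := func(u uint64) int64 { return int64(u + 1<<63) }

	ranges := make([]TokenRange, 0, n)
	for i := 0; i < n; i++ {
		lo := uint64(i) * step
		hi := lo + step - 1
		if i == n-1 {
			hi = math.MaxUint64 // last range absorbs the remainder
		}
		ranges = append(ranges, TokenRange{Start: toToken(lo), End: toToken(hi)})
	}
	return ranges
}

func main() {
	// Each printed predicate would go into the WHERE clause of one smaller
	// query; "pk" stands in for the table's partition key column.
	for _, r := range SplitRing(8) {
		fmt.Printf("token(pk) >= %d AND token(pk) <= %d\n", r.Start, r.End)
	}
}
```

Smaller sub-ranges mean each request holds fewer rows on the heap at once,
at the price of more round trips. This limits the size of any single read; it
does not change the fact that a full scan still touches every row.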

=Rob
