The emphasis is on "millions of rows in a **single** query". Fetching that much data in a single query is a bad idea because of the Java heap memory overhead. You can fetch millions of rows from Cassandra; just make sure you do it over thousands or millions of queries, not one single query.
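
One common way to spread a large read over many queries is to split the token ring into slices and run one query per slice. The following is only a minimal sketch of that idea, not something taken from this thread: it assumes the cluster uses the default Murmur3Partitioner (signed 64-bit tokens), and the table name `doc.my_table` and partition key column `pk` are hypothetical placeholders. It only computes the slices and the CQL text; with the driver, each `(start, end]` pair would be bound to the prepared statement and executed as its own query.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class TokenRangeSplitter {
    // Assumption: default Murmur3Partitioner, so tokens are signed 64-bit values.
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Splits the full token ring into `parts` contiguous (start, end] ranges. */
    static List<long[]> split(int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        BigInteger width = MAX.subtract(MIN); // BigInteger avoids long overflow
        BigInteger n = BigInteger.valueOf(parts);
        List<long[]> ranges = new ArrayList<>(parts);
        long start = Long.MIN_VALUE;
        for (int i = 1; i <= parts; i++) {
            // Upper bound of the i-th slice; the last slice ends exactly at MAX.
            long end = MIN.add(width.multiply(BigInteger.valueOf(i)).divide(n))
                          .longValueExact();
            ranges.add(new long[] {start, end});
            start = end;
        }
        return ranges;
    }

    /** Builds the CQL text for one slice; table and key names are placeholders. */
    static String rangeQuery(String table, String partitionKey) {
        return "SELECT * FROM " + table
            + " WHERE token(" + partitionKey + ") > ?"
            + " AND token(" + partitionKey + ") <= ?";
    }

    public static void main(String[] args) {
        // Hypothetical table and partition key column.
        System.out.println(rangeQuery("doc.my_table", "pk"));
        for (long[] r : split(4)) {
            // Each pair would be bound to the two '?' markers and run separately.
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

The number of slices is a tuning knob: more slices means smaller result sets per query and less heap pressure on the coordinator, at the cost of more round trips.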

On 12/03/2021 15:32, Joe Obernberger wrote:

One question on the 'millions of rows in a single query' point. How would you process that many rows? At some point, I'd like to be able to process 10-100 billion rows. Isn't that something that can be done with Cassandra? I'm coming from HBase, where we'd run MapReduce jobs.
Thank you.

-Joe

On 3/12/2021 9:07 AM, Bowen Song wrote:

Millions of rows in a single query? That sounds like a bad idea to me. Your "NoNodeAvailableException" could be caused by stop-the-world GC pauses, and the GC pauses are likely caused by the query itself.

On 12/03/2021 13:39, Joe Obernberger wrote:

Thank you Paul and Erick.  The keyspace is defined like this:
CREATE KEYSPACE doc WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;

Would that cause this?

The program that is having the problem selects data, calculates stuff, and inserts.  It works with smaller selects, but when the number of rows is in the millions, I start to get this error.  Since it works with smaller sets, I don't believe it to be a network error.  All the nodes are definitely up as other processes are working OK, it's just this one program that fails.

The full stack trace:

Error: com.datastax.oss.driver.api.core.NoNodeAvailableException: No node was available to execute the query
com.datastax.oss.driver.api.core.NoNodeAvailableException: No node was available to execute the query
        at com.datastax.oss.driver.api.core.NoNodeAvailableException.copy(NoNodeAvailableException.java:40)
        at com.datastax.oss.driver.internal.core.util.concurrent.CompletableFutures.getUninterruptibly(CompletableFutures.java:149)
        at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:53)
        at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:30)
        at com.datastax.oss.driver.internal.core.session.DefaultSession.execute(DefaultSession.java:230)
        at com.datastax.oss.driver.api.core.cql.SyncCqlSession.execute(SyncCqlSession.java:54)
        at com.abc.xxxx.fieldanalyzer.FTAProcess.udpateCassandraFTAMetrics(FTAProcess.java:275)
        at com.abc.xxxx.fieldanalyzer.FTAProcess.storeResults(FTAProcess.java:216)
        at com.abc.xxxx.fieldanalyzer.FTAProcess.startProcess(FTAProcess.java:199)
        at com.abc.xxxx.fieldanalyzer.Main.main(Main.java:20)

FTAProcess line 275 is:

ResultSet rs = session.execute(getFieldCounts.bind().setString(0, rb.getSource()).setString(1, rb.getFieldName()));

-Joe

On 3/12/2021 8:30 AM, Paul Chandler wrote:
Hi Joe

This could also be caused by the replication settings of the keyspace. If you use NetworkTopologyStrategy and it doesn’t list a replication factor for the datacenter datacenter1, then you will get this error message too.

Paul

On 12 Mar 2021, at 13:07, Erick Ramirez <erick.rami...@datastax.com> wrote:

Does it get returned by the driver every single time? The NoNodeAvailableException gets thrown when (1) all nodes are down, or (2) all the contact points are invalid from the driver's perspective.

Is it possible there's no route/connectivity from your app server(s) to the 172.16.x.x network? If you post the full error message + full stacktrace, it might provide clues. Cheers!

