Hi guys,

I'm having some trouble while using paged queries on Cassandra's Java
driver (version 3.3.2). I'm using Cassandra 3.11.0.

I have to fetch a page of data from the DB, then make some trivial changes,
and then update these rows.

A simplified version of the code i'm running would be:

                    Integer rowCounter = 0;
                    Statement selectQuery =
QueryBuilder.select()...setFetchSize(pageSize)...;
                    ResultSet result =
cassandraSession.execute(selectQuery);
                    List<MyClass> mappedResults = new ArrayList<>();
                    for (Row row : result) {
                        rowCounter++;
                        mappedResults.add(map(row));
                        if (rowCounter % pageSize == 0) {

                            List<MyClass> resultsToUpdate =
modifyData(mappedResults);
                            for (MyClass resultToUpdate : resultsToUpdate){
                                Statement updateQuery =
QueryBuilder.update(KEYSPACE, tableName)...;
                                cassandraSession.execute(query);
                            }
                            TimeUnit.SECONDS.sleep(sleepSeconds); //Sleep
for a few seconds to let the DB... breathe
                        }

                    }

I'm using consistency level ONE on both select and update queries, the
value of sleepSeconds is 5 and the pageSize is 47.

My problem is that I have to use very small page sizes, otherwise queries
start to timeout on Cassandra.

There is only one thread running this long update process, but when i check
Cassandra's debug.log, it looks like this:

...
DEBUG [ScheduledTasks:1] 2018-01-26 12:43:09,221 MonitoringTask.java:173 -
55 operations were slow in the last 5001 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 1940709131428868672 AND token(id) <= 1976881771356013545 LIMIT
47>, time 1232 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -31240603717813337 AND token(id) <= 93066413544676618 LIMIT
47>, time 672 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -2746601914911102981 AND token(id) <= -2679503374406295369
LIMIT 47>, time 722 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 8697901506577253630 AND token(id) <= 8756251242481074941 LIMIT
47>, time 1737 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -2566441277217350674 AND token(id) <= -2410488306633473620
LIMIT 47>, time 997 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 5186947162422827855 AND token(id) <= 5251256039266177164 LIMIT
47>, time 1619 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 523415566358416448 AND token(id) <= 558165594730430519 LIMIT
47>, time 793 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -6313110054894254305 AND token(id) <= -6149701678898756666
LIMIT 47>, time 510 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 133117363640100699 AND token(id) <= 326755086351479456 LIMIT
47>, time 594 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -5773756298752768296 AND token(id) <= -5672224259310839216
LIMIT 47>, time 631 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 9138868762246577790 AND token(id) <= 9184809921750217730 LIMIT
47>, time 1680 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 1481347618188085389 AND token(id) <= 1529429375374220120 LIMIT
47>, time 1337 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -3179570044050246190 AND token(id) <= -2975237200717735765
LIMIT 47>, time 773 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -1992364373944487162 AND token(id) <= -1754930707218513982
LIMIT 47>, time 793 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 7461256765584395144 AND token(id) <= 7513523865647503158 LIMIT
47>, time 1569 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 2199511646454841639 AND token(id) <= 2235092311035533306 LIMIT
47>, time 1157 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 5981009014177068366 AND token(id) <= 6193847522724693984 LIMIT
47>, time 1549 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 6587824379305518475 AND token(id) <= 6941621185441223079 LIMIT
47>, time 1491 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -2888016351766682341 AND token(id) <= -2832466742668731344
LIMIT 47>, time 642 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 4599678137499867302 AND token(id) <= 4681791682494977137 LIMIT
47>, time 1222 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 4097891947569113599 AND token(id) <= 4205652216148641874 LIMIT
47>, time 1421 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 1313069522772322867 AND token(id) <= 1345209653063462051 LIMIT
47>, time 1042 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 687538582456175035 AND token(id) <= 714387656474794527 LIMIT
47>, time 1262 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 3166906839309766962 AND token(id) <= 3220931935607646481 LIMIT
47>, time 1589 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 5661754624946584572 AND token(id) <= 5764445628939515857 LIMIT
47>, time 1491 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 6988693948300673349 AND token(id) <= 7029878484744478913 LIMIT
47>, time 1579 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -601653023540100942 AND token(id) <= -570711602903916229 LIMIT
47>, time 1032 msec - slow timeout 500 msec/cross-node
... (5 were dropped)
DEBUG [ScheduledTasks:1] 2018-01-26 12:43:14,222 MonitoringTask.java:173 -
52 operations were slow in the last 4998 msecs:
...


I can't understand why my process is making lots of SELECT queries in just
5 seconds, when it should be making one fetch, updating the rows, and then
sleeping for five seconds. If i check my application logs, everything seems
to be correct, there is only one fetch every five seconds.

I tried running the same code on another environment (with the same volume
of data). It runs much faster and the debug.log looks correct:
...
DEBUG [ScheduledTasks:1] 2018-01-26 12:32:25,574 MonitoringTask.java:173 -
1 operations were slow in the last 5003 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-03-13 AND
token(id) > token(AVrJ6PqGIUe_WZqLXGLv) LIMIT 200>, time 517 msec - slow
timeout 500 msec/cross-node
DEBUG [ScheduledTasks:1] 2018-01-26 12:32:35,575 MonitoringTask.java:173 -
1 operations were slow in the last 5000 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-03-13 AND
token(id) > token(AVrJ6GJqVlin-KDUvTKL) LIMIT 200>, time 680 msec - slow
timeout 500 msec/cross-node
DEBUG [ScheduledTasks:1] 2018-01-26 12:32:40,577 MonitoringTask.java:173 -
1 operations were slow in the last 5001 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-03-13 AND
token(id) > token(AVrJ6xVrVlin-KDUvXep) LIMIT 200>, time 647 msec - slow
timeout 500 msec/cross-node
...

Any thoughts on why is this happening would be highly appreciated.

Thanks in advance.

Reply via email to