Update - the answer was spark.cassandra.input.split.sizeInMB. The
default value is 512 MB. Setting it to 50 produced many more splits,
and the job ran in under 11 minutes with no timeout errors. In this
case the job was a simple count: 10 minutes 48 seconds for over 8.2
billion rows.
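For reference, a minimal Scala sketch of the setup that worked for me. It assumes the spark-cassandra-connector is on the classpath; the contact point, keyspace, and table names are placeholders for your own:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("cassandra-count")
      .config("spark.cassandra.connection.host", "10.0.0.1")  // placeholder contact point
      .config("spark.cassandra.input.split.sizeInMB", "50")   // default is 512; smaller value => more splits
      .getOrCreate()

    // Simple count over the full table, read through the connector's DataFrame source.
    val count = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table")) // placeholder names
      .load()
      .count()

    println(s"row count: $count")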
Update - I believe that for large tables,
spark.cassandra.read.timeoutMS needs to be very long, on the order of 4
hours or more. The job now runs much longer but still doesn't complete,
and I'm back to this all-too-familiar error:
com.datastax.oss.driver.api.core.servererrors.ReadTimeoutException:
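For anyone trying the same experiment, this is a sketch of how I raised that timeout when building the session. The 4-hour figure is just my guess above; whether it is long enough will depend on the table and cluster:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("cassandra-long-read-timeout")
      // Read timeout in milliseconds; 4 hours = 14,400,000 ms.
      .config("spark.cassandra.read.timeoutMS", (4L * 60 * 60 * 1000).toString)
      .getOrCreate()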