On Tue, May 31, 2016 at 3:14 PM, Darren Govoni <dar...@ontrenet.com> wrote:
> Well that could be the problem. A SQL database is essentially a big
> synchronizer. If you have a lot of spark tasks all bottlenecking on a single
> database socket (is the database clustered or colocated with spark workers?)
> then you will have blocked threads on the database server.

Totally agree this could be a big obstacle to scaling up, and we are planning to migrate. But in the meantime we are seeing these big issues with test data of only a few records (1, 2, 1024, etc.) produced to Kafka. Currently the database is NOT busy (CPU, memory and IO usage on the DB are tiny).